Abstract


This  paper  develops  an  equilibrium  theory  for  two-person  two-criteria 
stochastic  decision  problems  with  static  information  patterns,  wherein  the 
decision  makers  (DM’s)  have  different  probabilistic  models  of  the  underlying 
process,  the  objective  functionals  are  quadratic  and  the  decision  spaces  are 
general  inner-product  spaces.  Under  two  different  modes  of  decision  making 
(viz.  symmetric  and  asymmetric),  sufficient  conditions  are  obtained  for  the 
existence  and  uniqueness  of  equilibrium  solutions  (stable  in  the  former  case),  and  in 
each  case  a  uniformly  convergent  iterative  scheme  is  developed  whereby  the  equilibrium 
policies  of  the  DM's  can  be  obtained  by  evaluating  a  number  of  conditional 
expectations.  When  the  probability  measures  are  Gaussian,  the  equilibrium  solution 
is  linear  under  the  symmetric  mode  of  decision  making,  whereas  it  is  generically 
nonlinear  in  the  asymmetric  case,  with  the  linear  structure  prevailing  only  in 
some  special  cases  which  are  delineated  in  the  paper. 
1.  Introduction

A  team  is  defined  as  a  group  of  agents  who  work  together  in  a 
coordinated  effort,  in  a  possibly  hostile  and  uncertain  environment,  in  order 
to  achieve  a  common  goal.  In  achieving  this  goal,  the  members  of  the  team 
do  not  necessarily  acquire  the  same  information,  and  hence  they  have  to  operate 
in  a  decentralized  mode  of  decision  making.  The  scientific  approach  to 
formulation  and  analysis  of  team  problems  has  involved  (i)  a  quantification  of 
the  underlying  common  goal  in  the  form  of  a  (mathematical)  objective  function 
which  is  sought  to  be  optimized  jointly  by  the  agents,  and  (ii)  a  modeling 
of  the  uncertain  environment  and  the  possible  measurements  made  by  the  agents 
on  this  environment  in  the  form  of  a  probability  space  together  with  an 
appropriate  information  structure  [14,7,15,16].  The  underlying  stipulation  here 
has  been  the  existence  of  a  probability  space  that  is  common  to  all  the  agents, 
so  that  through  their  priors  all  members  of  the  team  "see  the  world"  in  exactly 
the  same  way. 

One question that readily comes to mind at this point is the
robustness  of  such  a  mathematical  model,  and  the  "optimum"  solutions  it  produces, 
to  slight  variations  in  the  underlying  assumptions.  In  particular,  what  if  the 
agents  perceive  the  outside  world  in  slightly  different  ways?  Would  the 
solution  obtained  under  the  assumption  of  common  prior  probability  measures 
change  drastically  if  there  are  discrepancies  in  the  agents'  perceptions  of  the 
probabilistic  description  of  the  outside  world?  In  order  to  be  able  to  answer 
these  queries  satisfactorily  and  effectively,  we  need  a  theory  of  equilibrium 
for  decision  problems  in  which  the  decision  makers  (DM's)  have  different 
probabilistic  models  of  the  system;  such  a  general  theory  will  clearly  subsume 
the  currently  available  results  on  teams  which  use  a  common  probability  space. 


Consider  a  static  team  decision  problem,  formulated  in  the  standard 
manner  as  in  [7],  with  the  only  difference  being  in  the  underlying  probability 
space.  In  particular,  assume  that  the  DM's  assign  different  subjective 
probabilities  to  the  uncertain  events,  in  which  case  there  will  not  exist  a 
common  probability  space,  thereby  leading  to  a  different  expected  (average) 
cost  function  for  each  DM.  Hence,  once  we  relax  the  assumption  of  existence 
of  a  common  probability  space,  the  team  problem  is  no  longer  a  stochastic 
optimization  problem  with  a  single  objective  functional,  and  we  inevitably 
have  to  treat  it  as  a  nonzero-sum  stochastic  game  [5,8,12].  Furthermore,  even 
though  the  original  team  decision  problem  with  a  common  probability  space 
will  admit  the  same  team-optimal  solution(s)  regardless  of  the  mode  of 
decision  making  (that  is,  regardless  of  whether  the  roles  of  the  DM's  are 
symmetric or whether there is a hierarchy and dominance in decision making),
this  feature  ceases  to  hold  true  when  there  exists  a  discrepancy  between  the 
perceived  probability  measures.  When  there  are  only  two  members,  for  example, 
two  possibilities  emerge  in  the  presence  of  discrepancies:  the  totally 
symmetric  roles,  corresponding  to  the  Nash  equilibrium  solution,  and  the 
hierarchical  mode, corresponding  to  the  Stackelberg  equilibrium  solution. 

Motivated  by  these  considerations,  we  treat  in  this  paper  a  more 
general  (than  team)  class  of  two-person  stochastic  decision  problems  which 
can  be  viewed  as  static  stochastic  nonzero-sum  games  with  the  DM's  having 
different  subjective  probability  measures.  Adopting  both  the  symmetric  and 
asymmetric  modes  of  decision  making,  we  develop  in  each  case  a  general 
theory of equilibrium when the objective functionals are quadratic and the
decision spaces are appropriate Hilbert spaces. Such a formulation includes both
finite-dimensional  (discrete)  and  continuous-time  decision  problems,  and  involves 
arbitrary  probability  measures  which  are,  though,  restricted  a  posteriori  by  the 
conditions  of  existence  and  uniqueness  developed  in  the  paper.  The  special  case  of 
Gaussian  distributions  is  studied  in  considerable  depth,  and  some  explicit  solutions 
are  obtained  with  appealing  features. 

The  organization  of  the  paper  is  as  follows.  The  next  section  (§2) 
provides  a  precise  problem  formulation,  and  introduces  the  two  solution  concepts 
adopted  in  the  paper.  Section  3  develops  general  conditions  for  existence  and 
uniqueness  of  a  stable  equilibrium  solution  under  the  symmetric  mode  of  decision 
making,  and  elucidates  the  extent  of  the  restrictions  imposed  on  the  problem  by 
these  conditions.  Section  4  presents  a  counterpart  of  the  results  of  Section  3 
under  the  asymmetric  mode  of  decision  making,  with  the  mathematical  machinery 
used  being  inherently  different  from  that  of  §3.  Section  5  deals  with  the  special 
class  of  Gaussian  distributions,  under  both  symmetric  and  asymmetric  modes  of 
decision  making.  In  the  former  case  it  is  shown  that  the  unique  stable  equilibrium 
solution  is  affine  in  the  measurements  and  can  be  obtained  explicitly.  In  the 
latter  case,  however,  the  solution  is  generically  nonlinear,  and  contains  summation 
of  terms  which  involve  products  of  linear  functions  of  measurements  with  exponential 
terms  (whose  exponents  are  quadratic  in  the  measurements) .  The  section  also  contains 
some  discussion  on  finite-dimensional  and  continuous-time  problems,  treated  as 
special  cases.  Section  6  is  devoted  to  discussions  on  possible  extensions  of 
these  results  in  different  directions,  provides  some  interpretation  of  the 
general  approach  and  results,  and  includes  some  concluding  remarks.  The  paper  ends 
with  five  Appendices  which  include  results  used  in  the  main  body  of  the  paper. 


2.  Mathematical Formulation and Some Basic Results


2.1.  Probability Spaces


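Let $\Omega = \mathbb{R}^n \times \mathbb{R}^{m_1} \times \mathbb{R}^{m_2} = X \times Y_1 \times Y_2$, let $\mathcal{B}$ denote the Borel field of subsets of $\Omega$, and let $\mathcal{B}^k$ denote the Borel field of subsets of $\mathbb{R}^k$, $k = n, m_1, m_2$. Let $\mathcal{P}$ denote the set of all probability measures on $(\Omega, \mathcal{B})$ with finite second moments, and for each $P \in \mathcal{P}$ denote the corresponding marginal measures on $\mathcal{B}^n$, $\mathcal{B}^{m_1}$ and $\mathcal{B}^{m_2}$ by $P_x$, $P_{y_1}$ and $P_{y_2}$, respectively. Furthermore, let the collection of all such probability measures be denoted by $\mathcal{P}_x$, $\mathcal{P}_{y_1}$ and $\mathcal{P}_{y_2}$, respectively. Then, for each $P \in \mathcal{P}$, the vector $z = (x', y_1', y_2')'$, taking values in $\Omega$, becomes a well-defined random vector on $(\Omega, \mathcal{B}, P)$, and likewise $x$ is a random vector on $(\mathbb{R}^n, \mathcal{B}^n, P_x)$ and $y_i$ is a random vector on $(\mathbb{R}^{m_i}, \mathcal{B}^{m_i}, P_{y_i})$.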

Here, $x$ denotes the unknown state of Nature, and $y_i$ denotes an observation of DMi (i'th decision maker) which is correlated with $x$. We now choose two elements out of $\mathcal{P}$, $P^1$ and $P^2$, which denote the subjective probabilities assigned to $z$ by DM1 and DM2, respectively. For technical reasons, we place some further restrictions on the choices of $P^1$ and $P^2$ through the marginals $P^i_{y_j}$; in particular we assume that

Condition (1). $P^1_{y_2}$ and $P^2_{y_1}$ are absolutely continuous [1] with respect to $P^2_{y_2}$ and $P^1_{y_1}$, respectively; that is, using the standard notation in probability theory,

$$P^1_{y_2} \ll P^2_{y_2}, \qquad P^2_{y_1} \ll P^1_{y_1}.$$

Condition (2). The Radon-Nikodym (R-N) derivative [1]

$$g_i(\xi) = dP^j_{y_i} / dP^i_{y_i}, \quad j \neq i, \tag{2}$$

is uniformly bounded a.e. $P^i_{y_i}$, $i = 1, 2$.


The necessity of these two conditions in the formulation of our problem will be
made clear in the sequel. We should note, however, that for the special case
when $P^1$ is equivalent to $P^2$, both of these conditions are satisfied (in the latter
case the bound is equal to 1) and we have the standard decision theoretic framework
[2] with a single probability space.
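
To make Conditions (1) and (2) concrete, the following minimal sketch evaluates the R-N derivative as a density ratio for scalar Gaussian marginals. The scalar Gaussian setting, the function names and the grid are illustrative assumptions and do not come from the paper. With equal means, the ratio is bounded when the numerator measure has the smaller variance (its supremum is then the ratio of standard deviations); with equal variances and different means it grows without bound, so Condition (2) fails.

```python
import math

def log_density_ratio(xi, mean_j, std_j, mean_i, std_i):
    """log of dP^j/dP^i at xi for scalar Gaussian marginals N(mean, std^2).

    Hypothetical helper: works in the log domain to avoid underflow.
    """
    zj = (xi - mean_j) / std_j
    zi = (xi - mean_i) / std_i
    return math.log(std_i / std_j) - 0.5 * zj * zj + 0.5 * zi * zi

def sup_ratio_on_grid(mean_j, std_j, mean_i, std_i, half_width=10.0, n=2001):
    """Largest value of the density ratio on a symmetric grid around zero."""
    step = 2.0 * half_width / (n - 1)
    return max(
        math.exp(log_density_ratio(-half_width + k * step,
                                   mean_j, std_j, mean_i, std_i))
        for k in range(n)
    )

# Narrower numerator measure, same mean: ratio is bounded by std_i / std_j.
bounded = sup_ratio_on_grid(0.0, 0.8, 0.0, 1.0)

# Same variance, shifted mean: ratio equals exp(xi - 0.5) and is unbounded.
growing = [math.exp(log_density_ratio(xi, 1.0, 1.0, 0.0, 1.0))
           for xi in (2.0, 5.0, 10.0)]
```

The grid search is only a numerical indication; a proof of boundedness needs the closed-form ratio.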

2.2.  Decision  and  Policy  Spaces 

The decision variable of DMi will be denoted by $u_i$, which belongs to a
real separable Hilbert space $U_i$ with inner product $(\cdot,\cdot)_i$. Permissible policies
(decision rules) for DMi are measurable mappings

$$\gamma_i: Y_i \to U_i, \qquad \int_{Y_i} \|\gamma_i(\xi)\|_i^2 \, P^i_{y_i}(d\xi) < \infty,$$

where $\|\cdot\|_i$ is the natural norm derived from $(\cdot,\cdot)_i$. Let $\Gamma_i$ denote the space of
all such policies, which is further equipped with the inner product

$$\langle \gamma, \beta \rangle_i = \int_{Y_i} (\gamma(\xi), \beta(\xi))_i \, P^i_{y_i}(d\xi).$$

Then, we have the following two results, the first of which is standard [3] and
the second of which involves a change of measures using the R-N derivative.

Lemma 1. $\Gamma_i$ is a Hilbert space. □

Lemma 2. If Conditions (1) and (2) are satisfied, every element of $\Gamma_i$ has
bounded second-order moments also under $P^j_{y_i}$, $j \neq i$. □


Let $D^i_{jj}: U_j \to U_j$ ($i \neq j$; $i,j = 1,2$) be strongly positive bounded linear
operators (that is, there exists an $a > 0$ such that $(u, D^i_{jj} u)_j \geq a\,(u,u)_j$ for all $u \in U_j$), and let $D^i_{ij}: U_j \to U_i$ and $F^i_j: X \to U_j$ be bounded linear operators for all $i,j = 1,2$.
Furthermore, let $E^i[\gamma(z) \mid y_i]$ denote the mathematical expectation of a
$z$-measurable random variable $\gamma(z)$ taking values in $U_i$, conditioned on the
random variable $y_i$, and under the probability measure $P^i$, i.e.

$$E^i[\gamma(z) \mid y_i] = \int_{\Omega} \gamma(z) \, P^i_{z|y_i}(dz \mid y_i) \tag{5}$$

where the second term of the integrand is the conditional probability measure
derived from $P^i$. Then, for each pair $(\gamma_1, \gamma_2) \in \Gamma_1 \times \Gamma_2$, we have a quadratic expected
cost functional for each DM, defined for DMi by

$$J_i(\gamma_1,\gamma_2) = \tfrac{1}{2}\langle \gamma_i, \gamma_i \rangle_i + \tfrac{1}{2}\int_{Y_j} (\gamma_j(\xi), D^i_{jj}\gamma_j(\xi))_j \, P^i_{y_j}(d\xi) - \langle \gamma_i, E^i[F^i_i x \mid y_i] \rangle_i$$
$$\qquad - \int_{X \times Y_j} (\gamma_j(\xi), F^i_j x)_j \, P^i(dx, Y_i, d\xi) - \langle \gamma_i, E^i[D^i_{ij}\gamma_j(y_j) \mid y_i] \rangle_i, \quad j \neq i, \tag{6}$$

every term of which can be shown to be finite, in view of Lemmas 1 and 2. Note
that in the absence of Conditions (1) and (2), $J_i$ is not necessarily finite and
hence the problem is not well defined.

It is worth mentioning here that (6) describes a most general type of
quadratic cost functional which is strictly convex in $u_i$, and that the formulation
here covers also the cases of team problems ($D^i_{jj} = I$, $D^1_{12} = D^{2*}_{21}$; $F^i_j = F^j_j$, $i \neq j$) and zero-sum games ($D^i_{jj} = -I$, $D^1_{12} = -D^{2*}_{21}$, $F^i_j = -F^j_j$, $i \neq j$). But even in these "single
loss-functional" problems, the DM's will have inherently different expected
cost functions whenever $P^1$ and $P^2$ are different, since then a common probability
space does not exist. This forces us to formulate the problem as a multi-criteria
optimization problem and introduce equilibrium solution concepts that
would be appropriate in this framework.


A  superscript  (*)  designates  the  adjoint  of  a  given  linear  operator 
defined on a Hilbert space, and $I$ designates the identity operator.
2.4.  Equilibrium Solution Under the Symmetric Mode of Decision Making

Since  the  expected  cost  functionals  (6),  together  with  the  policy  spaces, 
provide  a  normal  (strategic)  form  description,  regardless  of  the  presence  of 
multiple  probability  measures,  the  standard  definition  of  noncooperative  (Nash) 
equilibrium  [5]  remains  intact,  which  is  the  most  reasonable  solution  concept 
here  under  the  symmetric  mode  of  decision  making. 

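Definition 1. A pair of policies $(\gamma_1^{\circ}, \gamma_2^{\circ}) \in \Gamma_1 \times \Gamma_2$ constitutes a Nash equilibrium
solution if

$$J_1(\gamma_1^{\circ}, \gamma_2^{\circ}) \leq J_1(\gamma_1, \gamma_2^{\circ}) \quad \forall \gamma_1 \in \Gamma_1, \qquad J_2(\gamma_1^{\circ}, \gamma_2^{\circ}) \leq J_2(\gamma_1^{\circ}, \gamma_2) \quad \forall \gamma_2 \in \Gamma_2.$$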


Remark  1.  The  notion  of  stable  equilibrium  makes  particular  sense  (and  is  of 
paramount  importance)  in  decision  problems  wherein  the  DM's  have  different  priors 
on  the  uncertain  quantities,  because  it  is  determined  as  the  outcome  of  a  natural 
iterative  process.  In  this  process,  each  DM  responds  optimally  (using  his  priors) 
to  the  most  recent  decision  (policy)  of  the  other  DM,  with  the  priors  on  which 
this  decision  is  based  being  irrelevant.  In  other  words,  even  though  the  computation 
of  the  Nash  equilibrium  solution  will  depend  on  the  different  prior  probability 


measures perceived by the two DM's, in the iterative procedure that leads to this
equilibrium each DM has to know only his own prior and the other one's announced
policy at the previous step. For an earlier utilization of this concept in a
deterministic setting we refer the reader to [28]. □

2.5.  Equilibrium  Solution  Under  an  Asymmetric  Mode  of  Decision  Making 

In the case of the asymmetric mode there is a hierarchy in decision
making, which permits one DM (say DM1, the leader) to announce and enforce his policy
on the other DM (the follower). The relevant solution concept here is the leader-follower
(Stackelberg) solution which is introduced below.

Definition 3. A pair of policies $(\gamma_1^s, \gamma_2^s) \in \Gamma_1 \times \Gamma_2$ constitutes a leader-follower
(Stackelberg) equilibrium solution with unique follower responses, if there exists
a unique mapping $T_2: \Gamma_1 \to \Gamma_2$ satisfying

$$J_2(\gamma_1, T_2[\gamma_1]) \leq J_2(\gamma_1, \gamma_2), \quad \forall (\gamma_1, \gamma_2) \in \Gamma_1 \times \Gamma_2 \tag{10}$$

and furthermore

$$J_1(\gamma_1^s, T_2[\gamma_1^s]) \leq J_1(\gamma_1, T_2[\gamma_1]), \quad \forall \gamma_1 \in \Gamma_1 \tag{11}$$

with

$$\gamma_2^s = T_2[\gamma_1^s].$$

Remark 2. The uniqueness condition on $T_2$ is satisfied in our case, because $J_2$ is
strictly convex (and quadratic) in $\gamma_2$. □

Remark  3.  The  solution  introduced  above  may  not,  at  first  glance,  appear  to  be  an 
equilibrium  solution,  because  of  the  strict  ordering  of  the  DM's.  However,  it  can 
be  shown,  by  following  an  argument  first  developed  in  [17],  that  the  Stackelberg 
solution  can  be  viewed  as  the  so-called  "strong  equilibrium"  of  a  decision  problem 
with a modified (dynamic) information pattern [see Appendix E]. □
3.  General Conditions for a Stable Equilibrium Solution Under the Symmetric Mode

We  now  obtain  some  general  conditions  for  existence  of  stable  equilibrium 
solutions  under  the  symmetric  mode  of  decision  making,  and  also  consider  some 
special  cases  when  the  probability  measures  of  both  DM's  are  absolutely  continuous 
with respect to the Lebesgue measure (i.e. when densities exist). Firstly we have

Proposition 1. A pair of policies $(\gamma_1^{\circ}, \gamma_2^{\circ}) \in \Gamma_1 \times \Gamma_2$ constitutes a Nash equilibrium
solution to the decision problem of §2 if, and only if, it satisfies the pair of
equations (under the notation of (5)):

$$\gamma_1^{\circ}(y_1) = D^1_{12} E^1[\gamma_2^{\circ}(y_2) \mid y_1] + F^1_1 E^1[x \mid y_1] \tag{12a}$$

$$\gamma_2^{\circ}(y_2) = D^2_{21} E^2[\gamma_1^{\circ}(y_1) \mid y_2] + F^2_2 E^2[x \mid y_2]. \tag{12b}$$

Proof. This result follows from a simple minimization of the two quadratic forms
$J_1(\gamma_1, \gamma_2^{\circ})$ and $J_2(\gamma_1^{\circ}, \gamma_2)$ on the two Hilbert spaces $\Gamma_1$ and $\Gamma_2$, respectively, and by
virtue of the fact that these two quadratic forms are positive definite in the
relevant variables. □

By the same argument used in the proof of Proposition 1, relations (9a)
and (9b) in Def. 2 can equivalently be written as

$$\gamma_1^{(k)}(y_1) = D^1_{12} E^1[\gamma_2^{(k-1)}(y_2) \mid y_1] + F^1_1 E^1[x \mid y_1] \tag{13a}$$

$$\gamma_2^{(k)}(y_2) = D^2_{21} E^2[\gamma_1^{(k-1)}(y_1) \mid y_2] + F^2_2 E^2[x \mid y_2], \quad k = 1, 2, \ldots \tag{13b}$$

Now, substituting (13b) into (13a), and also (13a) into (13b), by appropriately
matching the superscripts, we arrive at the following two recursive relations:

$$\gamma_i^{(k)}(y_i) = D^i_{ij} D^j_{ji} E^i[E^j[\gamma_i^{(k-2)}(y_i) \mid y_j] \mid y_i] + F^i_i E^i[x \mid y_i] + D^i_{ij} F^j_j E^i[E^j[x \mid y_j] \mid y_i], \tag{14}$$

$$j, i = 1, 2; \; j \neq i; \; k = 2, 4, \ldots \text{ or } k = 3, 5, \ldots$$

Note that if the recursive scheme (14) converges for even values of k, it also
converges (to the same limit) for odd values of k [this follows from expressions
(13a)-(13b)]. Hence, we confine attention only to even values of k and obtain the
following result as a direct consequence of the foregoing analysis:

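Proposition 2. A pair of policies $(\gamma_1^{\circ}, \gamma_2^{\circ}) \in \Gamma_1 \times \Gamma_2$ constitutes a stable Nash
equilibrium solution if, and only if, for all $(\gamma_1^{(0)}, \gamma_2^{(0)}) \in \Gamma_1 \times \Gamma_2$,

$$\gamma_i^{\circ}(y_i) = \lim_{k \to \infty} \gamma_i^{(2k)}(y_i) \quad \text{in } \Gamma_i, \tag{15}$$

where $\gamma_i^{(2k)}$, $k = 1, 2, \ldots$, is given recursively by (14). Furthermore, such a stable
equilibrium solution is necessarily unique. □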
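
The iteration (13a)-(13b) can be illustrated on a toy problem with finitely many observation values and scalar decisions. The sketch below is not the paper's construction: the finite supports, the scalar gains `d12`, `d21`, `f1`, `f2`, the random joint probability tables and all function names are assumptions made for illustration. Each DM computes conditional expectations under its own table only, and the limit is checked against the fixed-point equations (12a)-(12b).

```python
import numpy as np

def make_model(joint, x_vals):
    """From a pmf joint[x, y1, y2] return P(y2|y1), P(y1|y2), E[x|y1], E[x|y2]."""
    joint = joint / joint.sum()
    p_y1y2 = joint.sum(axis=0)                      # shape (n1, n2)
    p_y1 = p_y1y2.sum(axis=1)
    p_y2 = p_y1y2.sum(axis=0)
    y2_given_y1 = p_y1y2 / p_y1[:, None]            # rows indexed by y1
    y1_given_y2 = (p_y1y2 / p_y2[None, :]).T        # rows indexed by y2
    ex_y1 = np.einsum("i,ijk->j", x_vals, joint) / p_y1
    ex_y2 = np.einsum("i,ijk->k", x_vals, joint) / p_y2
    return y2_given_y1, y1_given_y2, ex_y1, ex_y2

def best_response_iteration(P1, P2, x_vals, d12, d21, f1=1.0, f2=1.0, iters=200):
    """Simultaneous best responses; DM1 uses P1 only, DM2 uses P2 only."""
    A1, _, ex1, _ = make_model(P1, x_vals)          # P^1(y2|y1), E^1[x|y1]
    _, A2, _, ex2 = make_model(P2, x_vals)          # P^2(y1|y2), E^2[x|y2]
    g1 = np.zeros(P1.shape[1])
    g2 = np.zeros(P1.shape[2])
    for _ in range(iters):
        # Both updates use the policies from the previous step, as in (13a)-(13b).
        g1, g2 = d12 * A1 @ g2 + f1 * ex1, d21 * A2 @ g1 + f2 * ex2
    return g1, g2

rng = np.random.default_rng(0)
x_vals = np.array([-1.0, 0.0, 1.0])
P1 = rng.random((3, 4, 5)) + 0.1                    # DM1's subjective joint pmf
P2 = rng.random((3, 4, 5)) + 0.1                    # DM2's subjective joint pmf
g1, g2 = best_response_iteration(P1, P2, x_vals, d12=0.5, d21=0.8)
```

Because both tables are strictly positive here, the finite analogues of Conditions (1) and (2) hold automatically; with zero entries the density ratios could be undefined.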

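Let us now introduce linear operators $\mathcal{S}_i: \Gamma_i \to \Gamma_i$, $i = 1, 2$, by

$$(\mathcal{S}_i \gamma)(y_i) = D^i_{ij} D^j_{ji} E^i[E^j[\gamma(y_i) \mid y_j] \mid y_i], \quad j \neq i; \; i, j = 1, 2. \tag{16}$$

Note that $\mathcal{S}_i$ indeed maps $\Gamma_i$ into $\Gamma_i$, because the conditional expectation
$E^j[D^j_{ji}\gamma(y_i) \mid y_j]$ maps $\Gamma_i$ into $\Gamma_j$ ($j \neq i$) when the probability measures satisfy
Conditions (1) and (2), and every element of $\Gamma_i$ is square-integrable under both $P^i_{y_i}$
and $P^j_{y_i}$ (cf. Lemma 2).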

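Furthermore, let us introduce the notation $\langle\langle \mathcal{S} \rangle\rangle_i$ to denote the norm
of a bounded linear operator $\mathcal{S}: \Gamma_i \to \Gamma_i$, which is defined by

$$\langle\langle \mathcal{S} \rangle\rangle_i = \sup_{\gamma \in \Gamma_i} \left[ \langle \mathcal{S}\gamma, \mathcal{S}\gamma \rangle_i / \langle \gamma, \gamma \rangle_i \right]^{1/2}, \tag{17a}$$

and $r_i(\mathcal{S})$ to denote the spectral radius of $\mathcal{S}$, which is defined by [see Appendix A]

$$r_i(\mathcal{S}) = \limsup_{k \to \infty} \left[ \langle\langle \mathcal{S}^k \rangle\rangle_i \right]^{1/k}, \tag{17b}$$

where $\mathcal{S}^k$ denotes the k'th power of $\mathcal{S}$. Finally, let us introduce the linear
operators

$$D_i = D^i_{ij} D^j_{ji}, \tag{18a}$$

$$P_{i|i} = E^i[E^j[\,\cdot \mid y_j] \mid y_i], \tag{18b}$$

both of which map $\Gamma_i$ into itself (the former also maps $U_i$ into itself). Then,
the following theorem, whose proof depends on a contraction mapping
argument (see Appendix B), provides a set of necessary and sufficient conditions
for existence of the unique equilibrium solution alluded to in Prop. 2.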


Theorem 1. (i) Under Conditions (1) and (2), the decision problem of Section 2 admits
a unique stable Nash equilibrium solution given by (15) if, and only if, there exists,
for at least one $i = 1, 2$, a $\rho_i$, $0 < \rho_i < 1$, such that

$$r_i(\mathcal{S}_i) = r_i(D_i P_{i|i}) < \rho_i. \tag{19}$$

(ii) A set of sufficient conditions for (19) to hold true is the existence
of a pair of positive scalars $(\rho^i_1, \rho^i_2)$, such that

$$\rho^i_1 \rho^i_2 < 1, \quad r_i(D_i) \leq \rho^i_1, \quad r_i(P_{i|i}) \leq \rho^i_2. \tag{20a}$$

Furthermore, a set of sufficient conditions for the latter two is

$$\langle\langle D_i \rangle\rangle_i = \| D_i \|_i \leq \rho^i_1, \quad \langle\langle P_{i|i} \rangle\rangle_i \leq \rho^i_2, \tag{20b}$$

where $\| \cdot \|_i$ denotes the operator norm on $U_i$, as a counterpart of (17a).

Proof. See Appendix B. □
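
In a finite setting with scalar decisions, the operator in (16) becomes a matrix and condition (19) can be checked directly. The sketch below assumes finite observation sets, strictly positive joint tables for $(y_1, y_2)$ and scalar gains `d12`, `d21`; these, and the function names, are illustrative assumptions rather than the paper's setup. Since both conditional-probability matrices are row-stochastic, their product is row-stochastic with spectral radius 1, so in this toy case the spectral radius of $\mathcal{S}_1$ is simply $|d_{12} d_{21}|$.

```python
import numpy as np

def composite_operator(p1_y1y2, p2_y1y2, d12, d21):
    """Matrix of gamma -> d12*d21*E^1[E^2[gamma(y1)|y2]|y1] on a finite y1 set."""
    A1 = p1_y1y2 / p1_y1y2.sum(axis=1, keepdims=True)        # P^1(y2|y1), rows y1
    A2 = (p2_y1y2 / p2_y1y2.sum(axis=0, keepdims=True)).T    # P^2(y1|y2), rows y2
    return d12 * d21 * (A1 @ A2)

def spectral_radius(M):
    """Largest eigenvalue modulus of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))

rng = np.random.default_rng(0)
p1 = rng.random((4, 5)) + 0.1     # DM1's joint table for (y1, y2), unnormalized
p2 = rng.random((4, 5)) + 0.1     # DM2's joint table for (y1, y2), unnormalized

r_stable = spectral_radius(composite_operator(p1, p2, d12=0.5, d21=0.8))
r_unstable = spectral_radius(composite_operator(p1, p2, d12=1.5, d21=0.8))
```

In infinite-dimensional or unbounded-support problems (for instance the Gaussian case) the stochastic factor need not have spectral radius 1, which is why the theorem keeps it as a separate quantity.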

Part (ii) of Thm. 1 provides a partial separation (in terms of sufficient
conditions) of the deterministic and stochastic parts of the system. Now, if
the decision problem is a team problem with a common loss functional [which
requires $D^i_{jj} = I$, $D^1_{12} = D^{2*}_{21}$, $F^1_1 = F^2_1$ and $F^1_2 = F^2_2$], and the team cost is strictly
convex in the pair $(u_1, u_2)$ [which is true if and only if $\|D_1\|_1 = \|D_2\|_2 < 1$], it
follows that the first inequality of (20b) holds with $\rho^1_1 = \rho^2_1 < 1$. If, furthermore, the
subjective probability measures assigned to the pair $(y_1, y_2)$ by the two DM's are
equivalent, $P_{i|i}$ becomes the product of two projection operators, thus leading
to satisfaction of the second inequality in (20b) with $\rho^1_2 = \rho^2_2 = 1$, and thereby to
satisfaction of (20a). Hence, as a corollary to the second part of Thm. 1, we
obtain the following result which is known in different contexts [7,8,9].

Corollary 1. For the strictly convex quadratic team problem with equivalent subjective
probability measures assigned by the two DM's to $(y_1, y_2)$, there exists a unique stable
equilibrium solution (the so-called team-optimal solution), irrespective of the
underlying common probability measure. □

For team problems with $P^1 \neq P^2$, a result along the lines of Corollary 1 does
not in general hold, because the operator $P_{i|i}$ is not necessarily the product of two
projection operators. Then, the general condition is (19) [or the stronger one, (20a)],
which places some restrictions on the parameters of the cost functional, as well as
the probability measures $P^1$ and $P^2$. To delineate the extent of these restrictions,
we now study the second inequality of (20b) somewhat further and obtain the following
sufficient condition.

Corollary 2. For a given $\rho^i_2$, the second inequality of (20b) is satisfied if the
expression

$$g_i(y_i) E^j[g_j(y_j) \mid y_i] = g_i(y_i) \int_{Y_j} g_j(\eta) \, P^j_{y_j|y_i}(d\eta \mid \xi = y_i) \tag{21a}$$

is uniformly bounded from above by $(\rho^i_2)^2$ a.e. $P^i_{y_i}$. Furthermore, if the probability
measures $P^1$ and $P^2$ are absolutely continuous with respect to the Lebesgue measure,
this condition can be expressed equivalently in terms of the probability densities
$p^i(y_i, y_j)$ as follows:

This result is slightly more general than the related ones that can be
found in [7,8,9], since here $p^i$ is allowed to be different from $p^j$, though still a
restriction is imposed on these (indirectly) via the equivalence between $P^1$ and




4.  General Sufficient Conditions for a Stackelberg Equilibrium Solution

We now turn our attention to the asymmetric mode of decision making,
obtain some general sufficient conditions for existence of a Stackelberg equilibrium
solution, and provide a complete characterization of the solution. Subsequently we
consider some special cases with some further structure imposed on the cost functionals
and the probability measures.

Firstly we obtain an expression for DM2's unique reaction $T_2: \Gamma_1 \to \Gamma_2$, as
defined by (10), using Prop. 1:

$$T_2[\gamma_1](y_2) = \gamma_2^{\circ}(y_2) = D^2_{21} E^2[\gamma_1(y_1) \mid y_2] + F^2_2 E^2[x \mid y_2]. \tag{22}$$


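Hence, the derivation of the leader's Stackelberg policy $\gamma_1^s \in \Gamma_1$ involves (in view of
(11)) the minimization of $J_1$ over $\Gamma_1$ after $\gamma_2$ given by (22) is substituted in. This
substitution yields

$$J(\gamma) = J_1(\gamma, \gamma_2^{\circ}) = \tfrac{1}{2}\langle \gamma, \gamma \rangle_1 + \tfrac{1}{2}\int_{Y_2} \big(F^2_2 E^2[x \mid y_2] + D^2_{21} E^2[\gamma(y_1) \mid y_2], \; D^1_{22} D^2_{21} E^2[\gamma(y_1) \mid y_2] + D^1_{22} F^2_2 E^2[x \mid y_2]\big)_2 \, P^1_{y_2}(d\xi)$$
$$\qquad - \langle \gamma, E^1[F^1_1 x \mid y_1] \rangle_1 - \int_{X \times Y_2} \big(D^2_{21} E^2[\gamma(y_1) \mid y_2] + F^2_2 E^2[x \mid y_2], \; F^1_2 x\big)_2 \, P^1(dx, Y_1, d\xi)$$
$$\qquad - \langle \gamma, \; E^1[D^1_{12} D^2_{21} E^2[\gamma(y_1) \mid y_2] \mid y_1] + E^1[D^1_{12} F^2_2 E^2[x \mid y_2] \mid y_1] \rangle_1,$$

where we have deleted the subscript 1 in $\gamma_1$ in order to simplify the notation.
Now, since $\Gamma_1$ is a linear space, and $J$ is the sum of terms homogeneous of degree
zero, one and two (maximum), any minimizing solution $\gamma^s \in \Gamma_1$ will have to satisfy

$$\Delta J(\gamma^s; h) = J(\gamma^s + h) - J(\gamma^s) = \delta J(\gamma^s; h) + \delta^2 J(\gamma^s; h) \geq 0 \quad \forall h \in \Gamma_1, \tag{24}$$

where $\delta^i J(\gamma; h)$ is the Gateaux variation of $J(\gamma)$ of degree $i$. Extensive
manipulations, details of which are given in Appendix D (subsection 1), lead to
the following expressions for $\delta J$ and $\delta^2 J$: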


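$$\delta J(\gamma; h) = \langle h, \gamma \rangle_1 - \int_{Y_1} (h(y_1), (\Xi\gamma)(y_1))_1 \, P^1_{y_1}(dy_1) - \int_{Y_1} (h(y_1), \beta(y_1))_1 \, P^1_{y_1}(dy_1) \tag{25}$$

$$\delta^2 J(\gamma; h) = \tfrac{1}{2}\langle h, h \rangle_1 + \tfrac{1}{2}\int_{Y_1} \big(h(y_1), \; g_1(y_1) E^2[g_2(y_2) D^{2*}_{21} D^1_{22} D^2_{21} E^2[h(y_1) \mid y_2] \mid y_1]\big)_1 \, P^1_{y_1}(dy_1) - \langle h, D^1_{12} D^2_{21} P_{1|1} h \rangle_1 \tag{26}$$

(Here $\delta^1 J$ is written simply as $\delta J$.) In these expressions $\Xi$ and $\beta$ are defined by

$$(\Xi\gamma)(y_1) = \big(D^1_{12} D^2_{21} \mathcal{P}_{1|1} + D^{2*}_{21} D^{1*}_{12} \mathcal{P}^*_{1|1}\big)\gamma(y_1) - D^{2*}_{21} D^1_{22} D^2_{21} \, g_1(y_1) E^2[g_2(y_2) E^2[\gamma(y_1) \mid y_2] \mid y_1] \tag{27a}$$

$$\beta(y_1) = F^1_1 E^1[x \mid y_1] - D^{2*}_{21} D^1_{22} F^2_2 \, g_1(y_1) E^2[g_2(y_2) E^2[x \mid y_2] \mid y_1] + D^1_{12} F^2_2 E^1[E^2[x \mid y_2] \mid y_1] + D^{2*}_{21} F^1_2 \, g_1(y_1) E^2[g_2(y_2) E^1[x \mid y_2] \mid y_1], \tag{27b}$$

$\mathcal{P}_{1|1}: \Pi_1 \to \Pi_1$ is a linear operator given by

$$\mathcal{P}_{1|1}\gamma(y_1) = E^1[E^2[\gamma(y_1) \mid y_2] \mid y_1],$$

$\Pi_1$ is the space of $y_1$-measurable random variables taking values in $U_1$, and $g_i(\xi)$ are
the R-N derivatives (2). Note that $\mathcal{P}_{1|1}$ is related to $P_{1|1}$ defined by (18b) by

$$\mathcal{P}_{1|1}[\gamma(y_1)] = (P_{1|1}\gamma)(y_1),$$

where the latter (which is a mapping from $\Gamma_1$ into $\Gamma_1$) has been used in (26) and will
also be used in the sequel whenever needed.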

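Now, since (24) is also equivalent to

$$\delta J(\gamma^s; h) = 0 \quad \forall h \in \Gamma_1, \qquad \delta^2 J(\gamma^s; h) \geq 0 \quad \forall h \in \Gamma_1, \tag{29}$$

a Stackelberg solution $\gamma^s \in \Gamma_1$ will exist for the leader if, and only if,

(i) (26) is nonnegative definite,
and (from (25)):

(ii) $\gamma(y_1) - (\Xi\gamma)(y_1) - \beta(y_1) = 0$, a.e. $P^1_{y_1}$. (30)

Since the first of these conditions does not depend on $\gamma$, the optimal solution
is solely determined by (30), which can be rewritten as

$$\gamma(y_1) = D^1_{12} D^2_{21} E^1[E^2[\gamma(y_1) \mid y_2] \mid y_1] + D^{2*}_{21} D^{1*}_{12} \, g_1(y_1) E^2[g_2(y_2) E^1[\gamma(y_1) \mid y_2] \mid y_1]$$
$$\qquad + D^{2*}_{21} F^1_2 \, g_1(y_1) E^2[g_2(y_2) E^1[x \mid y_2] \mid y_1] - D^{2*}_{21} D^1_{22} D^2_{21} \, g_1(y_1) E^2[g_2(y_2) E^2[\gamma(y_1) \mid y_2] \mid y_1]$$
$$\qquad + F^1_1 E^1[x \mid y_1] - D^{2*}_{21} D^1_{22} F^2_2 \, g_1(y_1) E^2[g_2(y_2) E^2[x \mid y_2] \mid y_1] + D^1_{12} F^2_2 E^1[E^2[x \mid y_2] \mid y_1], \tag{31}$$

where we have utilized the fact that the adjoint of $\mathcal{P}_{1|1}$ is a linear operator
$\mathcal{P}^*_{1|1}: \Pi_1 \to \Pi_1$, given by [see Appendix D, subsection 2]

$$\mathcal{P}^*_{1|1}\gamma(y_1) = \int_{Y_1} \gamma(\eta) \int_{Y_2} \frac{P^1_{y_1 y_2}(d\eta \times dy_2) \, P^2_{y_1 y_2}(dy_1 \times dy_2)}{P^2_{y_2}(dy_2) \, P^1_{y_1}(dy_1)} = g_1(y_1) E^2[g_2(y_2) E^1[\gamma(y_1) \mid y_2] \mid y_1].$$

Furthermore, condition (i) can be rewritten as

$$A \triangleq I + \tfrac{1}{2} D^{2*}_{21} D^1_{22} D^2_{21} (K + K^*) - D^1_{12} D^2_{21} P_{1|1} - D^{2*}_{21} D^{1*}_{12} P^*_{1|1} \geq 0 \tag{33}$$

where $I: \Gamma_1 \to \Gamma_1$ is the identity operator, and $K: \Gamma_1 \to \Gamma_1$ is defined by

$$(K\gamma)(y_1) = g_1(y_1) E^2[g_2(y_2) E^2[\gamma(y_1) \mid y_2] \mid y_1]. \tag{34}$$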

We  now  summarize  these  results  in  the  following  proposition: 

Proposition 3. Under Conditions (1) and (2), the decision problem with multiple
probability measures admits a Stackelberg equilibrium solution if, and only if,
$A$ is nonnegative definite and (31) admits a solution in $\Gamma_1$. □

Equation (31) will, in general, not admit a closed-form solution, even
if all random variables are jointly Gaussian distributed (see §5.3); therefore,
we will have to resort to numerical computations which will involve a recursion of
some type. Hence, in analyzing the conditions of existence of a solution to (31)
we may also require that such a numerical scheme be globally convergent (or stable).
One appealing scheme whereby a unique solution to (31) [or, equivalently, (30)] can
be obtained is the recursion

$$\gamma^{(k)}(y_1) = (\Xi\gamma^{(k-1)})(y_1) + \beta(y_1), \quad k = 1, 2, \ldots \tag{35}$$

where $\gamma^{(0)}$ is chosen as an arbitrary element of $\Gamma_1$. If the limit $\lim_{k \to \infty} \gamma^{(k)} = \gamma^s$
exists in $\Gamma_1$ for all such initial choices, then $\gamma^s$ will necessarily constitute a
solution to (31). A sufficient condition for this readily follows from
Lemma B.1, which we give below as Prop. 4.


Proposition 4. In addition to the conditions of Prop. 3, assume that there exists
a scalar $\rho$, $0 < \rho < 1$, such that

$$r(\Xi) < \rho \tag{36}$$

where $r(\Xi)$ is the spectral radius of $\Xi$. Then, the decision problem admits a
unique Stackelberg equilibrium solution $(\gamma^s, T_2[\gamma^s])$, where $\gamma^s \in \Gamma_1$ is the limit of
the iterative scheme (35), and $T_2$ is the affine operator (22). □
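
The recursion (35) is an affine fixed-point iteration, and its behaviour under (36) can be seen in a finite-dimensional stand-in where the operator is a matrix. The sketch below is only an analogue: the 2x2 matrix `Xi`, the vector `beta` and the function name are made-up illustrative values, not quantities from the paper. When the spectral radius is below one, every starting point converges to the same limit, which solves the linear equation corresponding to (30).

```python
import numpy as np

def fixed_point(Xi, beta, gamma0, tol=1e-12, max_iter=10_000):
    """Iterate gamma <- Xi @ gamma + beta until successive iterates agree."""
    gamma = np.asarray(gamma0, dtype=float).copy()
    for k in range(1, max_iter + 1):
        nxt = Xi @ gamma + beta
        if np.linalg.norm(nxt - gamma) < tol:
            return nxt, k
        gamma = nxt
    raise RuntimeError("iteration did not converge")

# Hypothetical operator with complex eigenvalues of modulus sqrt(0.17) < 1.
Xi = np.array([[0.2, 0.5],
               [-0.3, 0.1]])
beta = np.array([1.0, 2.0])

limit_a, steps_a = fixed_point(Xi, beta, np.zeros(2))
limit_b, steps_b = fixed_point(Xi, beta, np.array([100.0, -50.0]))

# Direct solution of gamma - Xi gamma - beta = 0, for comparison.
direct = np.linalg.solve(np.eye(2) - Xi, beta)
```

Note that a spectral radius below one does not require the matrix norm to be below one, which is why the condition is stated through the spectral radius and sharpened into norm bounds only afterwards.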

We now further elaborate on (36), so as to bring it to a form which
separates out the contributions from the deterministic and probabilistic
components of the problem. [Here, we are seeking sufficient conditions which
would constitute the counterpart of (20) in this context.] Towards
this end, let us first note that using (34) in (27a):

$$r(\Xi) = r\big(D^1_{12} D^2_{21} P_{1|1} + D^{2*}_{21} D^{1*}_{12} P^*_{1|1} - D^{2*}_{21} D^1_{22} D^2_{21} K\big) \tag{37}$$

and utilizing the inequality relationship between the spectral radius and norm
of an operator (see Appendix A, Lemma A.1) this can be bounded from above by

$$\leq \langle\langle D^1_{12} D^2_{21} P_{1|1} + D^{2*}_{21} D^{1*}_{12} P^*_{1|1} - D^{2*}_{21} D^1_{22} D^2_{21} K \rangle\rangle_1$$

where $\langle\langle \cdot \rangle\rangle_1$ is the operator norm as defined in (17a). Using the standard
(triangle inequality) property of norms, this can further be bounded from above
by

$$\leq \langle\langle D^1_{12} D^2_{21} P_{1|1} + D^{2*}_{21} D^{1*}_{12} P^*_{1|1} \rangle\rangle_1 + \langle\langle D^{2*}_{21} D^1_{22} D^2_{21} K \rangle\rangle_1.$$

Now since both $D^{2*}_{21} D^1_{22} D^2_{21}$ and $K$ map a Hilbert space ($\Gamma_1$) into itself, using the
norm inequality for products of linear operators, we further have

$$\leq \langle\langle D^1_{12} D^2_{21} P_{1|1} + D^{2*}_{21} D^{1*}_{12} P^*_{1|1} \rangle\rangle_1 + \langle\langle D^{2*}_{21} D^1_{22} D^2_{21} \rangle\rangle_1 \, \langle\langle K \rangle\rangle_1$$

$$= r\big(D^1_{12} D^2_{21} P_{1|1} + D^{2*}_{21} D^{1*}_{12} P^*_{1|1}\big) + r\big(D^{2*}_{21} D^1_{22} D^2_{21}\big) \, [r(K^* K)]^{1/2}$$

where the equality follows because (i) the spectral radius and norm of a
self-adjoint linear operator are equal [13, p. 514], (ii) the norm of a non-self-adjoint
linear operator $K$ is equal to the square root of the spectral radius of the
self-adjoint operator $K^* K$ (see Appendix A, Lemma A.1). Finally, using the result
of Lemma A.2 (Appendix A), the latter is bounded from above by

$$r(\Xi) \leq 2\,[r(D^1_{12} D^2_{21} D^{2*}_{21} D^{1*}_{12})]^{1/2} \, [r(P^*_{1|1} P_{1|1})]^{1/2} + r\big(D^{2*}_{21} D^1_{22} D^2_{21}\big) \, [r(K^* K)]^{1/2}. \tag{38}$$


Now,  let  us  assume  the  following: 

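Condition (3). There exist four positive scalars $\rho_1, \rho_2, \rho_3, \rho_4$, satisfying

$$2\rho_1\rho_2 + \rho_3\rho_4 < 1 \tag{39}$$

such that

$$r(D_{12}D_{21}D_{21}^*D_{12}^*) \le (\rho_1)^2, \qquad r(D_{21}^*D_{22}D_{21}) \le \rho_3 \tag{40a}$$

$$r(P_{1|1}^*P_{1|1}) \le (\rho_2)^2, \qquad r(K^*K) \le (\rho_4)^2 \tag{40b}$$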


Then, we have

Theorem 2. Under Conditions (1)-(2) of §2 and Condition (3) given above, the decision problem admits a unique Stackelberg equilibrium solution $(\gamma^S, T_2[\gamma^S])$, where $\gamma^S$ is the limit of the iterative scheme (35), and $T_2$ is given by (22).


Proof. The result follows from Prop. 4 and the discussion and derivation that leads to Condition (3), provided we show that the given three conditions subsume (33), i.e. nonnegativity of operator $A$. We now verify that Condition (3) in fact implies that $A$ is a strongly positive operator. First note that $A$ is self-adjoint, because $K$ commutes with $D_{21}^*D_{22}D_{21}$. Hence, using Lemma A.3 (Appendix A), we can write down the inequality

$$r(A-I) \le \tfrac{1}{2}\, r\big(D_{21}^*D_{22}D_{21}(K+K^*)\big) + r\big(D_{12}D_{21}P_{1|1} + D_{21}^*D_{12}^*P_{1|1}^*\big).$$


Then, using the line of arguments that led to (38) from (37), and the spectral radius inequality for the product of two self-adjoint operators, we obtain the bound

$$r(A-I) \le \tfrac{1}{2}\, r(D_{21}^*D_{22}D_{21})\, r(K+K^*) + 2\,[r(D_{12}D_{21}D_{21}^*D_{12}^*)]^{1/2}\,[r(P_{1|1}^*P_{1|1})]^{1/2}$$

$$\le \tfrac{1}{2}\,\rho_3\, r(K+K^*) + 2\rho_1\rho_2 .$$


But  note  that 


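$$r(K+K^*) = \sup_{\gamma\in\Gamma_1} \big[\langle\gamma,(K+K^*)\gamma\rangle_1 / \langle\gamma,\gamma\rangle_1\big] = 2\sup_{\gamma\in\Gamma_1} \big[\langle\gamma,K\gamma\rangle_1 / \langle\gamma,\gamma\rangle_1\big]$$

and since, from the Cauchy-Schwarz inequality of inner products,

$$|\langle\gamma,K\gamma\rangle_1|^2 \le |\langle\gamma,\gamma\rangle_1|\,|\langle K\gamma,K\gamma\rangle_1| ,$$

we have

$$r(K+K^*) \le 2\sup_{\gamma\in\Gamma_1} \big[\langle K\gamma,K\gamma\rangle_1 / \langle\gamma,\gamma\rangle_1\big]^{1/2} = 2\sup_{\gamma\in\Gamma_1} \big[\langle\gamma,K^*K\gamma\rangle_1 / \langle\gamma,\gamma\rangle_1\big]^{1/2} = 2\,[r(K^*K)]^{1/2} \le 2\rho_4 .$$

Thus, $r(A-I) \le \rho_3\rho_4 + 2\rho_1\rho_2 < 1$, implying that the spectrum of the self-adjoint operator $A-I$ is uniformly in the unit sphere. Hence, $A$ is strongly positive. □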
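The step $r(K+K^*) \le 2[r(K^*K)]^{1/2}$ used in the proof can be illustrated on matrices. The following is a minimal sketch, assuming a real square matrix stands in for the operator $K$; the helper name `symmetrized_bound` is hypothetical.

```python
import numpy as np

def symmetrized_bound(k):
    """Return (r(K + K*), 2 * sqrt(r(K* K))) for a real square matrix K."""
    # K + K^T is symmetric, so its spectral radius is the largest |eigenvalue|.
    lhs = float(np.max(np.abs(np.linalg.eigvalsh(k + k.T))))
    # K^T K is symmetric nonnegative definite; its largest eigenvalue is ||K||^2.
    rhs = 2.0 * float(np.sqrt(np.max(np.linalg.eigvalsh(k.T @ k))))
    return lhs, rhs

if __name__ == "__main__":
    rng = np.random.default_rng(7)
    k = rng.normal(size=(5, 5))
    print(symmetrized_bound(k))
```

Equality is attained when $K$ is itself self-adjoint, in which case both sides equal $2\,r(K)$.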

For the special class of strictly convex team problems (cf. §3) with multiple probability measures, several simplifications can be made. In this case eq. (31) simplifies to

$$\gamma(y_1) = D_{12}D_{12}^*\big\{E^1[E^2[\gamma(y_1)|y_2]\,|\,y_1] + g^1(y_1)E^2[g^2(y_2)\{E^1[\gamma(y_1)|y_2] - E^2[\gamma(y_1)|y_2]\}\,|\,y_1]\big\} + F_1E^1[x|y_1]$$
$$\qquad + D_{12}F_2\,g^1(y_1)E^2[g^2(y_2)\{E^1[x|y_2] - E^2[x|y_2]\}\,|\,y_1] + D_{12}F_2E^1[E^2[x|y_2]\,|\,y_1], \tag{41}$$

and in Condition (3) inequalities (40a) are replaced by the single inequality

$$[r(D_{12}D_{12}^*D_{12}D_{12}^*)]^{1/2} = \langle\langle D_{12}D_{12}^* \rangle\rangle_1 \le \rho_1 = \rho_3$$

where $\rho_1$ can be taken to be less than one. Hence, (39) reads

$$(2\rho_2 + \rho_4) < 1/\rho_1 \tag{42}$$


We  now  summarize  these  results  as  a  corollary  to  Thm.  2: 

Corollary 3. Under Conditions (1)-(2) of §2, and (42) given above, the strictly convex quadratic team problem, with multiple probability measures and asymmetric mode of decision making, admits a unique Stackelberg equilibrium solution $(\gamma^S, T_2[\gamma^S])$, where $\gamma^S \in \Gamma_1$ is the limit of the iterative scheme (35) with

$$(\Xi\gamma)(y_1) = D_{12}D_{12}^*\big[(P_{1|1} + P_{1|1}^*)\gamma(y_1) - g^1(y_1)E^2[g^2(y_2)E^2[\gamma(y_1)|y_2]\,|\,y_1]\big]$$

and $T_2$ is given by (22). □


Remark 3. When the original problem is a Stackelberg game, but the probability measures are identical, a study of the original condition (36) reveals the inequality

$$r(\Xi) = r\big(D_{12}D_{21} + D_{21}^*D_{12}^* - D_{21}^*D_{22}D_{21}\big) < 1 .$$

This is the existence condition associated with the standard stochastic Stackelberg game, which corroborates the earlier result obtained in [25]. □

We  now  conclude  this  section  by  presenting  the  counterpart  of  Corollary  2 
in  the  present  context,  which  provides  a  set  of  (simpler)  sufficient  conditions  for 
(40b)  to  be  satisfied: 

Corollary 4. For a given pair $(\rho_2, \rho_4)$, the first and second inequalities of (40b) are satisfied if, respectively,

$$g^1(y_1)\int_{Y_2} g^2(\eta)\,P^2_{y_2|y_1}(d\eta\,|\,y_1) \tag{43a}$$

and

$$g^1(y_1)\int_{Y_2} |g^2(\eta)|^2\,P^2_{y_2|y_1}(d\eta\,|\,y_1)\int_{Y_1} g^1(\zeta)\,P^2_{y_1|y_2}(d\zeta\,|\,\eta) \tag{43b}$$

are uniformly bounded from above by $(\rho_2)^2$ and $(\rho_4)^2$.


Furthermore, if probability densities exist (with respect to the Lebesgue measure), these conditions can be expressed in terms of the corresponding probability density functions $p^i(\cdot)$ as follows:

$$\frac{p^2_{y_1}(y_1)}{p^1_{y_1}(y_1)}\int_{Y_2}\frac{p^1_{y_2}(\eta)}{p^2_{y_2}(\eta)}\,p^2_{y_2|y_1}(\eta|y_1)\,d\eta \le (\rho_2)^2 \tag{44a}$$

$$\frac{p^2_{y_1}(y_1)}{p^1_{y_1}(y_1)}\int_{Y_2}\left[\frac{p^1_{y_2}(\eta)}{p^2_{y_2}(\eta)}\right]^2 p^2_{y_2|y_1}(\eta|y_1)\,d\eta\int_{Y_1}\frac{p^2_{y_1}(\zeta)}{p^1_{y_1}(\zeta)}\,p^2_{y_1|y_2}(\zeta|\eta)\,d\zeta \le (\rho_4)^2 \tag{44b}$$


Proof. For (43a)-(43b) see Appendix C; (44a)-(44b), however, follow readily from (43a)-(43b). □

5.  Jointly Gaussian Distributions

In  decision  and  control  theory,  one  appealing  class  of  probability 
distributions  is  the  Gaussian  distribution,  because  it  leads  to  tractable  problems 
admitting, in  most  cases,  closed-form  solutions.  Indeed  when  the  probability 
measures  of  the  two  DM's  are  identical  and  Gaussian,  equilibrium  solutions  have 
been  shown  to  be  affine  functions  of  the  observations  for  (i)  quadratic  stochastic 
team  problems  defined  on  Euclidean  spaces  [7],  (ii)  quadratic  stochastic  Nash  games 
on  Euclidean  spaces  [8],  (iii)  quadratic  continuous-time  stochastic  team  problems 
[9],  (iv)  quadratic  stochastic  Stackelberg  games  on  Euclidean  spaces  [25],  and  (v) 
quadratic  continuous-time  stochastic  Stackelberg  games  [26].  In  this  section,  we 
investigate  possible  extensions  of  this  appealing  structural  feature  to  the  case 
when  discrepancies  exist  between  the  subjective  Gaussian  distributions,  as 
reflected in the covariances of the random vectors. We could also have included discrepancies in the perceptions of the mean values, but such a more general treatment does not contribute substantially to the qualitative nature of the results obtained in the sequel, and besides it makes the expressions notationally cumbersome. The interested reader can find the relevant expressions for the nonzero mean case in [27].

We first introduce notation and terminology, and delineate Conditions (1) and (2) of §2 (§5.1). Then, we study the case of symmetric mode of decision making in §5.2, and show that the unique equilibrium solution of Thm. 1 is linear. Finally, in §5.3 we treat the case of asymmetric mode of decision making, and show that (in contradistinction with the result of §5.2) the unique Stackelberg solution of Thm. 2 is generically nonlinear.

5.1.  Notation  and  Terminology 

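Let $(x, y_1, y_2)$ be zero-mean Gaussian random vectors under both $P^1$ and $P^2$, with

$$\text{covariance}\,(y_1, y_2) = \Sigma^i_y = \begin{pmatrix} \Sigma^i_{y_1} & \Sigma^i_{y_1y_2} \\ \Sigma^i_{y_2y_1} & \Sigma^i_{y_2} \end{pmatrix} > 0, \quad \text{under } P^i, \tag{45a}$$

$$\text{covariance}\,(x, y_1, y_2) = \text{cov}(x, y) = \Sigma^i = \begin{pmatrix} \Sigma^i_x & \Sigma^i_{xy} \\ \Sigma^i_{yx} & \Sigma^i_y \end{pmatrix} > 0, \quad \text{under } P^i. \tag{45b}$$

These probability distributions clearly satisfy the absolute continuity condition (Condition (1)) of Thms. 1 and 2. Furthermore, since

$$g^i(\xi) = \big(\det\Sigma^i_{y_i} / \det\Sigma^j_{y_i}\big)^{1/2}\exp\{-\tfrac{1}{2}\,\xi' W_i\,\xi\} \tag{46a}$$

$$W_i \triangleq (\Sigma^j_{y_i})^{-1} - (\Sigma^i_{y_i})^{-1}, \quad j \ne i, \tag{46b}$$

the uniform boundedness condition (Condition (2)) of Thms. 1 and 2 is satisfied whenever

$$W_i > 0, \quad i=1,2. \tag{47}$$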

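After making these observations, let us introduce the additional notation

$$N_i \triangleq M^j_{ii} - M^j_{ij}B_j^{-1}M^j_{ji} - (\Sigma^i_{y_i})^{-1} \tag{48a}$$

$$B_j \triangleq M^j_{jj} + W_j \tag{48b}$$

$$\begin{pmatrix} M^i_{11} & M^i_{12} \\ M^i_{21} & M^i_{22} \end{pmatrix} \triangleq \begin{pmatrix} \Sigma^i_{y_1} & \Sigma^i_{y_1y_2} \\ \Sigma^i_{y_2y_1} & \Sigma^i_{y_2} \end{pmatrix}^{-1} \tag{48c}$$

$$q^i \triangleq \big[\det\Sigma^i_{y_i}\cdot\det\Sigma^j_{y_j} \,/\, \det\Sigma^i_{y_j}\cdot\det B_j\cdot\det\Sigma^j_y\big]^{1/2} \tag{48d}$$

in terms of which we evaluate (21a) [using standard properties of Gaussian distributions] to be

$$g^i(y_i)\,E^j[g^j(y_j)\,|\,y_i] = q^i\exp\{-\tfrac{1}{2}\,y_i'N_iy_i\}, \quad j \ne i. \tag{49}$$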


We  are  now  in  a  position  to  specialize  the  results  of  Thms.  1  and  2  to  Gaussian 
distributions  and  obtain  some  explicit  results. 
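As a sanity check on the closed form (49), the sketch below compares it with direct numerical integration in the scalar case ($m_1 = m_2 = 1$, $i=1$, $j=2$). It is only an illustration under stated assumptions: the covariance values are arbitrary, $g^i$ is taken to be the density ratio $p^j_{y_i}/p^i_{y_i}$, and the function names are hypothetical.

```python
import numpy as np

def normal_pdf(z, var):
    """Zero-mean scalar Gaussian density with variance var."""
    return np.exp(-0.5 * z**2 / var) / np.sqrt(2.0 * np.pi * var)

def lhs_numeric(y1, s1, s2, half_width=15.0, n=6001):
    """g^1(y1) * E^2[g^2(y2) | y1] by quadrature over y2.

    s1, s2: 2x2 covariances of (y1, y2) as perceived by DM1 and DM2.
    """
    eta = np.linspace(-half_width, half_width, n)
    d_eta = eta[1] - eta[0]
    m2 = np.linalg.inv(s2)
    quad = m2[0, 0] * y1**2 + 2.0 * m2[0, 1] * y1 * eta + m2[1, 1] * eta**2
    joint2 = np.exp(-0.5 * quad) / (2.0 * np.pi * np.sqrt(np.linalg.det(s2)))
    cond2 = joint2 / normal_pdf(y1, s2[0, 0])            # p^2(eta | y1)
    g2 = normal_pdf(eta, s1[1, 1]) / normal_pdf(eta, s2[1, 1])
    g1 = normal_pdf(y1, s2[0, 0]) / normal_pdf(y1, s1[0, 0])
    return g1 * np.sum(g2 * cond2) * d_eta

def rhs_closed_form(y1, s1, s2):
    """q^1 * exp(-0.5 * N_1 * y1^2), scalar version of (48a)-(49)."""
    m2 = np.linalg.inv(s2)
    w2 = 1.0 / s1[1, 1] - 1.0 / s2[1, 1]                 # W_2
    b2 = m2[1, 1] + w2                                   # B_2
    n1 = m2[0, 0] - m2[0, 1]**2 / b2 - 1.0 / s1[0, 0]    # N_1
    q1 = np.sqrt(s1[0, 0] * s2[1, 1]
                 / (s1[1, 1] * b2 * np.linalg.det(s2)))  # q^1
    return q1 * np.exp(-0.5 * n1 * y1**2)

if __name__ == "__main__":
    s1 = np.array([[2.0, 0.3], [0.3, 1.0]])   # illustrative values only
    s2 = np.array([[1.5, 0.5], [0.5, 2.0]])
    for y1 in (0.0, 0.7, -1.3):
        print(y1, lhs_numeric(y1, s1, s2), rhs_closed_form(y1, s1, s2))
```

With these values both $W_1$ and $W_2$ are positive, so (47) holds and the integrand decays fast enough for a plain Riemann sum on a wide grid to be accurate.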

5.2.  Symmetric  Mode  of  Decision  Making 

In  order  to  apply  Thm.  1  to  the  Gaussian  decision  problem  formulated  above, 
we  first  explore  the  satisfaction  of  various  conditions  given  there.  We  have  already 
shown  above  that  Condition  (1)  is  always  satisfied  and  Condition  (2)  is  satisfied 
whenever $W_i > 0$. For the remaining condition we study inequalities (20b). The second of these is satisfied, for a given $\rho_2$, if (using (21a)) expression (49) is uniformly bounded in $y_i$, and this bound is no greater than $(\rho_2)^2$. For uniform boundedness of (49) it is necessary and sufficient that

$$N_i \ge 0 \tag{50a}$$

under which the latter condition becomes

$$q^i \le (\rho_2)^2 .$$

Hence going back to (20a), the condition

$$\langle\langle D_{ij}D_{ji} \rangle\rangle_i < 1/q^i, \quad \text{for at least one } i=1,2, \tag{50b}$$

becomes sufficient for (19). We are now in a position to state and prove the following theorem:
following  theorem: 

Theorem 3. Let (47) hold for i=1,2, and (50a)-(50b) hold for at least one i. Then, the quadratic Gaussian decision problem formulated in this section admits a unique stable Nash equilibrium solution where $u_i^o = \gamma_i^o(y_i)$ are linear in $y_i$, and are given by

$$\gamma_i^o(y_i) = L_iy_i, \quad i=1,2. \tag{51}$$

Here, $L_i: \mathbb{R}^{m_i} \to U_i$ are bounded linear operators, constituting the unique solution to the linear operator equations

.-1  .  .-1 

L  v  -  D1  L  S'*  E  v. 

iyt  u  ji  i  yj  ypp'  <S2) 

-1  .-1  .  •  .'I 

-  Iy1y1I>'-;  71  *  y*  -  0. 


Proof. The existence and uniqueness of the solution follows from Thm. 1, Corollary 2, and the discussion that precedes the statement of the theorem. The linearity of this unique solution, on the other hand, follows by noting that if the pair $(\gamma_1^{(0)}, \gamma_2^{(0)})$ is taken to be linear in $y_i$ in (14), all the terms of the sequence are linear, and hence the limit (which exists as already proven) is linear. Hence, choosing $\gamma_i$ as in (51), where $L_i: \mathbb{R}^{m_i} \to U_i$ are bounded linear operators, substituting this into (14) and requiring it to hold for all $y_i \in \mathbb{R}^{m_i}$ (since all probability measures are Gaussian), leads to the unique relations (52). □
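The linearity argument in the proof (a linear starting pair keeps every iterate linear, so only the gains need to be tracked) can be sketched for a scalar problem. The code below is an illustration under explicit assumptions that are not taken from the paper: each DM minimizes a cost of the hypothetical form $\tfrac12 u_i^2 + d\,u_iu_j + f_i u_i x$ under its own Gaussian model, so the best response is $\gamma_i(y_i) = -d\,E^i[\gamma_j(y_j)|y_i] - f_iE^i[x|y_i]$. The convergence test of this sketch is contraction of the gain map, which is not the same statement as (50a)-(50b).

```python
import numpy as np

def linear_nash_gains(d, f1, f2, cov1, cov2, iters=200):
    """Iterate best responses on the gains (L1, L2) of linear policies.

    cov_i: 3x3 covariance of (x, y1, y2) as perceived by DM i (assumed
    zero-mean Gaussian). The cost structure is a labeled assumption.
    """
    s1 = cov1[2, 1] / cov1[1, 1]   # E^1[y2 | y1] = s1 * y1
    t1 = cov1[0, 1] / cov1[1, 1]   # E^1[x  | y1] = t1 * y1
    s2 = cov2[1, 2] / cov2[2, 2]   # E^2[y1 | y2] = s2 * y2
    t2 = cov2[0, 2] / cov2[2, 2]   # E^2[x  | y2] = t2 * y2
    l1, l2 = 0.0, 0.0              # start from the zero (linear) policy
    for _ in range(iters):
        # Simultaneous update: each DM responds to the other's last policy.
        l1, l2 = -(d * s1 * l2 + f1 * t1), -(d * s2 * l1 + f2 * t2)
    return l1, l2, (s1, t1, s2, t2)

def fixed_point_gains(d, f1, f2, coeffs):
    """Solve the two linear equations satisfied by the limiting gains."""
    s1, t1, s2, t2 = coeffs
    lhs = np.array([[1.0, d * s1], [d * s2, 1.0]])
    rhs = np.array([-f1 * t1, -f2 * t2])
    return np.linalg.solve(lhs, rhs)

if __name__ == "__main__":
    cov1 = np.array([[1.0, 0.6, 0.5], [0.6, 1.2, 0.3], [0.5, 0.3, 1.0]])
    cov2 = np.array([[1.0, 0.6, 0.5], [0.6, 1.0, 0.4], [0.5, 0.4, 1.5]])
    l1, l2, coeffs = linear_nash_gains(0.5, 1.0, 1.0, cov1, cov2)
    print((l1, l2), fixed_point_gains(0.5, 1.0, 1.0, coeffs))
```

In this scalar surrogate the iteration contracts whenever $|d^2 s_1 s_2| < 1$, and the limit coincides with the solution of the linear system, mirroring the role played by (52) in the theorem.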

Remark 4. Thm. 3 above extends the result of Thm. 2 of [8] on quadratic Gaussian games to the case when a common probability space does not exist and the decision spaces are not necessarily finite dimensional, and shows that the appealing linear structure prevails when there exists a discrepancy in the perceptions of the two DM's of the underlying probability measures. The existence and uniqueness conditions here are, however, more restrictive than those of [8], and also involve the probabilistic structure (see (50b)). Expression (21a) in the most general case (and (49) for the special Gaussian case) is not uniformly (in $y_i$) bounded by 1, unless $g^i(y_i) = g^j(y_j) = 1$ a.e. $P^i_{y_i}$ and $P^j_{y_j}$ (which corresponds to the case of equivalent probability measures), since R-N derivatives (if different from 1) will be both smaller and larger than unity on sets of nonzero measure. This then implies, in view of (47), and from (49), that $q^i \ge 1$, i=1,2, with the inequality being strict if $P^i_{y_i}$ is not equivalent to $P^j_{y_i}$ for at least one i=1,2, $j \ne i$. In such a case, even in team problems, a stable equilibrium solution may not exist, particularly if $1/q^i < \langle\langle D_{ij}D_{ji} \rangle\rangle_i < 1$ for at least one i=1,2; $j \ne i$. This indicates, in general, the presence of a strong coupling between probabilistic and deterministic elements of the problem in terms of existence conditions. However, if the discrepancy between perceptions of the DM's on the probability measures (measured in terms of R-N derivatives) is sufficiently small, one would expect $q^i$ to be sufficiently close to unity, which ensures satisfaction of condition (50c) for a fairly general class of quadratic strictly convex Gaussian team problems (since $\langle\langle D_{12}D_{21} \rangle\rangle_1 = \langle\langle D_{21}D_{12} \rangle\rangle_2 = \rho < 1$ for such team problems). For further discussion on this point we refer the reader to [10]. □

In  the  statement  of  Thm.  1,  the  condition  (47)  places  some  severe  restrictions 
on  the  second  moments  of  the  underlying  distributions  (in  case  a  discrepancy  exists) , 
which  may  however  be  relaxed  if  we  are  willing  to  consider  equilibrium  policies  in  a 
more  restricted  space.  More  specifically,  satisfaction  of  (47)  ensures  that  regardless 
of  what  initial  set  of  policies  the  DM's  start  the  infinite  recursion  (15)  with,  every 
element  of  this  series  is  well-defined,  and  under  (50a)-(50b)  it  will  converge  to  a 
unique  limit  which  is  linear;  in  other  words,  even  if  the  DM's  start  with  nonlinear 
policies,  the  end  result  will  be  a  linear  equilibrium  solution.  The  condition  (47)  is 
restrictive,  because  we  require  (without  imposing  any  constraints  on  the  policy  spaces) 
the  series  generated  by  (15)  to  be  well-defined  even  with  nonlinear  starting  conditions. 
However,  if  we  restrict  the  team  agents  to  linear  policies  from  the  outset,  under 
Gaussian  distributions  (and  following  the  argument  used  in  the  proof  of  Thm.  1)  elements 
of  the  series  (15)  will  be  well-defined  (without  requiring  (47))  and  will  converge  to 
the  equilibrium  solution  provided  that  (50a)-(50b)  hold  for  at  least  one  i=l,2.  This 

line  of  reasoning  then  leads  to  the  following  result  which  we  give  without  a  proof. 

Proposition 5. Let $\Gamma_i^0$ be the class of all linear policies in the form (51), with $L_i: \mathbb{R}^{m_i} \to U_i$ a bounded linear operator, i=1,2. On $\Gamma_1^0 \times \Gamma_2^0$ the statement of Thm. 1 is valid even if (47) does not hold true. □

We  now  interpret  these  results  in  the  context  of  two  examples  one  of  which 
is  a  scalar  team  problem  and  the  other  one  is  a  continuous-time  team  problem,  both 
with  multiple  subjective  Gaussian  probabilities. 

Example 1. Consider a family of scalar Gaussian team problems, with

$D_{11} = D_{22} = 1$, $D_{12} = D_{21} = d$, $F_1 = f_1$, $F_2 = f_2$, $n = m_1 = m_2 = 1$, and


,  ,  v ac>j  . 


W  w  •  .•  .  *  ...  *  V*  V  ^ 


To investigate the applicability of Thm. 1 to this class of problems, let us first observe that condition (47) is satisfied if, and only if, both

$$0 < \mu < 1, \qquad 0 < \eta < 1. \tag{54}$$

For condition (50a), we evaluate $N_i$ and require it to be nonnegative for either i=1 or i=2:

$$N_1 = (\mu ab - c^2)(1-\mu)[\mu a^2 - c^2 - (1-\mu)ab] \,/\, \{a[\mu^2a^2 - (1-\mu)(\mu ab - c^2)]\} \ge 0 \tag{55a}$$

or

$$N_2 = (\eta ab - e^2)(1-\eta)[\eta b^2 - e^2 - (1-\eta)ab] \,/\, \{b[\eta^2b^2 - (1-\eta)(\eta ab - e^2)]\} \ge 0. \tag{55b}$$

Finally, condition (50b) dictates either

$$\mu a^2|d|^2 < \eta[\mu^2a^2 - (1-\mu)(\mu ab - c^2)] \tag{56a}$$

or

$$\eta b^2|d|^2 < \mu[\eta^2b^2 - (1-\eta)(\eta ab - e^2)] \tag{56b}$$

provided that the terms on the right-hand side are positive (if not, then the inequalities will accordingly change direction).

The set of values for $a, b, c, e, \mu, \eta$ that satisfy (54)-(56) is clearly not empty. To gain some further insight into these conditions, let us consider the class of team decision problems in which the discrepancies between the DM's perceptions of the variances of different Gaussian random variables are relatively small, that is, there exist sufficiently small $\varepsilon_1 > 0$ and $\varepsilon_2 > 0$ such that $\mu = 1-\varepsilon_1$, $\eta = 1-\varepsilon_2$, and furthermore $e \approx c$, and $|c|$ is considerably smaller than both $a$ and $b$. Note that, when $\varepsilon_1 = \varepsilon_2 = 0$, conditions (54)-(56) are all satisfied (note that $|d| < 1$ because of strict convexity of the objective functional) regardless of the relative magnitudes of $e$ and $c$. Hence, when the discrepancy is only in the perceptions of the correlation between $y_1$ and $y_2$, the scalar quadratic Gaussian team problem always admits a stable equilibrium solution. Now, for nonzero,
but positive, and sufficiently small $\varepsilon_i$, the dominating term in (55a) is

$$N_1 \approx (\mu a^2 - c^2)/\mu^2a^3$$

which is positive, in view of (53) and the initial hypothesis that $|a/c| \gg 1$. Likewise, $N_2$ is positive whenever $0 < \varepsilon_2 \ll 1$ and $|b/e| \gg 1$. Furthermore, given a $\bar d$, $0 < \bar d < 1$, we can always find $\varepsilon_1^*$ and $\varepsilon_2^*$, both in (0,1), so that both (56a) and (56b) are satisfied whenever $|d| < \bar d$. Hence, the conclusion is that when the deviations of the perceptions of the DM's from the common Gaussian probability measures are incremental (and satisfying (54)), the linear equilibrium solution of the Gaussian scalar team problem retains its stability property (but, of course, at a different (possibly close, in norm) equilibrium point). □

Example 2. As a second illustration of Thm. 1, for infinite-dimensional decision spaces, we consider here a class of stochastic Gaussian team problems defined in continuous time. More specifically, let $U_1 = U_2 = \mathcal{L}_2(0,T)$, the Hilbert space of all scalar-valued Lebesgue square-integrable functions on the bounded interval [0,T], endowed with the standard inner product $\int_0^T u(t)v(t)\,dt$, for $u, v \in \mathcal{L}_2$. Furthermore, let $Y_1 = Y_2 = \mathbb{R}$, and the Gaussian statistics have zero mean, and variances be as given in (53). Let $D_{11} = D_{22} = I$, the identity operator on $\mathcal{L}_2$, and $D_{12} = D_{21}^*$ be the Fredholm operator

$$D_{12}u = \int_0^T K(t,s)u(s)\,ds \tag{57}$$

where $K(t,s)$ is a continuous kernel on $0 \le t,s \le T$, and finally let $F_i = f_i(t)$, i=1,2, which are continuous functions on [0,T].

Now, conditions (47) and (50a) depend only on the probabilistic structure, and are therefore again given by (54) and (55), respectively. For (50b), however, we have to obtain the counterpart of (56), by simply replacing $|d|^2$ with the norm of the operators $D_{12}D_{12}^*$ and $D_{12}^*D_{12}$, respectively. Since $D_{12}^*u = \int_0^T K(s,t)u(s)\,ds$, the self-adjoint operator $D_{12}D_{12}^*$ is given by

$$D_{12}D_{12}^*u = \int_0^T\!\!\int_0^T K(t,\tau)K(s,\tau)u(s)\,ds\,d\tau = \int_0^T \tilde K(t,s)u(s)\,ds,$$

where

$$\tilde K(t,s) = \int_0^T K(t,\tau)K(s,\tau)\,d\tau. \tag{58a}$$

Let

$$\lambda = \Big\{\int_0^T\!\!\int_0^T |\tilde K(t,s)|^2\,dt\,ds\Big\}^{1/2}. \tag{58b}$$

Then,

$$\|D_{12}D_{12}^*u\|^2 = \int_0^T \Big|\int_0^T \tilde K(t,s)u(s)\,ds\Big|^2 dt \le \int_0^T \Big[\int_0^T |\tilde K(t,s)|^2\,ds\Big]\Big[\int_0^T |u(s)|^2\,ds\Big]dt = \lambda^2\|u\|^2$$

where the second step follows from the Cauchy-Schwarz inequality. Hence,

$$\langle\langle D_{12}D_{12}^* \rangle\rangle_1 \le \lambda ,$$

and because of symmetry $D_{12}^*D_{12}$ is also bounded in norm by the same quantity.
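The quantity $\lambda$ of (58b) is straightforward to approximate once a kernel is fixed. The sketch below discretizes the operator on a midpoint grid and compares $\lambda$ with the norm of the discretized $D_{12}D_{12}^*$; the kernel, grid size and function name are illustrative assumptions only.

```python
import numpy as np

def hilbert_schmidt_bound(kernel, horizon=1.0, n=400):
    """Return (lambda, operator norm) for a discretized D12 D12*.

    kernel(t, s): vectorized continuous kernel K(t, s) on [0, horizon]^2.
    """
    h = horizon / n
    t = (np.arange(n) + 0.5) * h                  # midpoint grid
    k = kernel(t[:, None], t[None, :])            # K(t_i, s_j)
    k_tilde = (k @ k.T) * h                       # composite kernel, cf. (58a)
    lam = np.sqrt(np.sum(k_tilde**2) * h * h)     # lambda, cf. (58b)
    op_norm = np.linalg.norm(k_tilde * h, 2)      # largest singular value
    return lam, op_norm

if __name__ == "__main__":
    # Hypothetical kernel chosen only for illustration.
    lam, op_norm = hilbert_schmidt_bound(lambda t, s: 0.8 * np.exp(-np.abs(t - s)))
    print("lambda =", lam, " norm of D12 D12* ~", op_norm)
```

On the grid, $\lambda$ is exactly the Frobenius norm of the discretized operator, so the inequality between the two returned numbers is the matrix counterpart of the Cauchy-Schwarz step above; $\lambda$ is in general a conservative bound.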

This then leads to the following counterpart of (56): A sufficient condition for satisfaction of (50b) is either

$$\mu a^2\lambda < \eta[\mu^2a^2 - (1-\mu)(\mu ab - c^2)] \tag{59a}$$

or

$$\eta b^2\lambda < \mu[\eta^2b^2 - (1-\eta)(\eta ab - e^2)] \tag{59b}$$

provided that the terms on the right-hand side are positive, where $\lambda$ is defined by (58a)-(58b).

Hence, under (54) and either (55a) and (59a) or (55b) and (59b), the continuous-time static decision problem formulated above admits a unique stable equilibrium solution, and this solution is given by (from Thm. 3):

$$\gamma_i^o(t, y_i) = k_i(t)\,y_i, \quad i=1,2, \tag{60}$$

where $k_i(t)$ are continuous functions on [0,T], satisfying

$$k_1(t) - \frac{ec}{ab}\int_0^T \tilde K(t,s)k_1(s)\,ds - \frac{\sigma_2 e}{ab}\int_0^T K(t,s)f_2(s)\,ds - \frac{\sigma_1}{a}f_1(t) = 0 \tag{61a}$$

$$k_2(t) - \frac{ec}{ab}\int_0^T \tilde K(s,t)k_2(s)\,ds - \frac{\sigma_1 c}{ab}\int_0^T K(s,t)f_1(s)\,ds - \frac{\sigma_2}{b}f_2(t) = 0. \tag{61b}$$


Note that $k_i(t)$ above stands for operator $L_i$ in (52), and we have already shown that a unique solution to both (61a) and (61b) exists in $\mathcal{L}_2[0,T]$, under (54) and either (55a) and (59a) or (55b) and (59b), and this solution is also continuous.

Finally, if our interest lies only in the existence of a unique linear equilibrium solution (not necessarily stable), the required condition is unique solvability of the integral equations (61a)-(61b), for which a sufficient condition is [6]

$$(ec/ab)\,\lambda < 1$$

where $\lambda$ is defined by (58b).


5.3.  Asymmetric  Mode  of  Decision  Making 

To  obtain  the  counterpart  of  the  results  of  §5.2  under  the  asymmetric  mode 
of  decision  making,  we  first  investigate  the  possibility  for  the  unique  solution  of 
Thm.  2  to  be  linear.  Towards  this  end  we  first  observe  that  the  decision  problem  will 
admit  a  unique  linear  solution  if,  and  only  if,  equation  (31)  is  satisfied  by  the 
decision  rule 


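$$\gamma(y_1) = Ay_1 \tag{62}$$

for some linear bounded operator $A: \mathbb{R}^{m_1} \to U_1$. Hence, using (31), $A$ should be the solution of (by pulling $A$ out of the conditional expectations)

$$Ay_1 = D_{12}D_{21}A\,E^1[E^2[y_1|y_2]\,|\,y_1] + D_{21}^*D_{12}^*A\,g^1(y_1)E^2[g^2(y_2)E^1[y_1|y_2]\,|\,y_1]$$
$$\qquad - D_{21}^*D_{22}D_{21}A\,g^1(y_1)E^2[g^2(y_2)E^2[y_1|y_2]\,|\,y_1]$$
$$\qquad + F_1E^1[x|y_1] + D_{21}^*F_2\,g^1(y_1)E^2[g^2(y_2)E^1[x|y_2]\,|\,y_1]$$
$$\qquad - D_{21}^*D_{22}F_2\,g^1(y_1)E^2[g^2(y_2)E^2[x|y_2]\,|\,y_1] + D_{12}F_2E^1[E^2[x|y_2]\,|\,y_1], \quad \forall y_1 \in \mathbb{R}^{m_1}. \tag{63a}$$

Since the random variables are jointly Gaussian under both measures,

$$E^i[y_k\,|\,y_\ell] = S^i_{k\ell}\,y_\ell, \quad i,k,\ell = 1,2 \tag{63b}$$

$$E^i[x\,|\,y_\ell] = S^i_{0\ell}\,y_\ell, \quad i,\ell = 1,2, \tag{63c}$$

for some matrices $S^i_{k\ell}$ and $S^i_{0\ell}$.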

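In view of this, (63a) can be rewritten as

$$Ay_1 = \big(D_{12}D_{21}AS^2_{12}S^1_{21} + F_1S^1_{01} + D_{12}F_2S^2_{02}S^1_{21}\big)y_1 + \big[D_{21}^*D_{12}^*AS^1_{12} - D_{21}^*D_{22}D_{21}AS^2_{12}$$
$$\qquad + D_{21}^*F_2S^1_{02} - D_{21}^*D_{22}F_2S^2_{02}\big]\,g^1(y_1)E^2[g^2(y_2)y_2\,|\,y_1]. \tag{64}$$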


This then leads to the following Proposition:

Proposition 6. Let (47) and Condition (3) be satisfied, and either $P^1_{y_1} \ne P^2_{y_1}$ or $P^1_{y_2} \ne P^2_{y_2}$. Then, the quadratic Gaussian decision problem with asymmetric mode of decision making admits a linear (Stackelberg) equilibrium solution if, and only if,

(i) there exists a bounded linear operator $A: \mathbb{R}^{m_1} \to U_1$ satisfying

$$A = D_{12}D_{21}AS^2_{12}S^1_{21} + F_1S^1_{01} + D_{12}F_2S^2_{02}S^1_{21} \tag{65a}$$

and

(ii) this solution also satisfies

$$D_{21}^*D_{12}^*AS^1_{12} - D_{21}^*D_{22}D_{21}AS^2_{12} + D_{21}^*F_2S^1_{02} - D_{21}^*D_{22}F_2S^2_{02} = 0. \tag{65b}$$

Proof. Since the "if" part is obvious in view of Thm. 2, we verify only the "only if" part of the proposition. [In what follows we adopt the notation $S \ge 0$ to imply that the nonnegative definite matrix $S$ has at least one positive eigenvalue.] The proof proceeds by showing for three exclusive (and exhaustive) cases that $f(y_1) = g^1(y_1)E^2[g^2(y_2)y_2\,|\,y_1]$ is a nonlinear function of $y_1$.

(a) $P^1_{y_2} = P^2_{y_2}$, and $P^1_{y_1} \ne P^2_{y_1}$.

Here, $g^2(y_2) = 1$, and $g^1(y_1) = c_1\exp\{-\tfrac12 y_1'W_1y_1\}$, where $W_1 \ge 0$, and $c_1 > 0$ is a constant. Hence, $f(y_1) = g^1(y_1)S^2_{21}y_1$ which is nonlinear since $W_1 \ne 0$.

(b) $P^1_{y_2} \ne P^2_{y_2}$, $P^1_{y_1} = P^2_{y_1}$.

Here, $g^1(y_1) = 1$, and $g^2(y_2) = c_2\exp\{-\tfrac12 y_2'W_2y_2\}$, where $W_2 \ge 0$, and $c_2 > 0$ is a constant. In this case, $f$ can be evaluated to be

$$f(y_1) = c\,(V+W_2)^{-1}VS^2_{21}y_1\exp\{-\tfrac12 y_1'By_1\}$$

where

$$V = E^2\{(y_2 - S^2_{21}y_1)(y_2 - S^2_{21}y_1)'\}$$

$$B = S^{2\prime}_{21}VS^2_{21} - S^{2\prime}_{21}V'(V+W_2)^{-1}VS^2_{21} \ge 0,$$

and $c$ is a constant. Since $W_2 \ge 0$, $B$ has at least one positive eigenvalue,

and hence $f(y_1)$ is again nonlinear in $y_1$.

(c) $P^1_{y_2} \ne P^2_{y_2}$ and $P^1_{y_1} \ne P^2_{y_1}$.

In this case, following the same lines as above, we find

$$f(y_1) = c\,(V+W_2)^{-1}VS^2_{21}y_1\exp\{-\tfrac12 y_1'(B+W_1)y_1\}$$

which is nonlinear since both $B \ge 0$, $W_1 \ge 0$.

Hence, in view of the preceding analysis, a necessary condition for existence of a solution to (64) is that the last term should vanish (i.e. (65b)) for an $A$ that solves (65a). □

Remark 5. A sufficient condition for (65a) to admit a unique solution in the Banach space of linear bounded operators mapping $\mathbb{R}^{m_1}$ into $U_1$ is

$$r(D_{12}D_{21}D_{21}^*D_{12}^*)\,\mathrm{Tr}\{S^2_{12}S^1_{21}S^{1\prime}_{21}S^{2\prime}_{12}\} < 1$$

which is clearly satisfied under Condition (3). □

The conditions of Prop. 6 are clearly non-void; because, given the unique solution of (65a), it may be possible to find $F_1$, $F_2$, $S^1_{02}$ and $S^2_{02}$ so that (65b) is satisfied. However, it should also be clear that satisfaction of (65b) places some severe restrictions on the parameters of the problem, which in general will not be met. Hence, it is fair to say that, if either $P^1_{y_1} \ne P^2_{y_1}$ or $P^1_{y_2} \ne P^2_{y_2}$, generically the problem does not admit a linear equilibrium solution, even if it is a team problem; that is:

Corollary 5. If either $P^1_{y_1} \ne P^2_{y_1}$ or $P^1_{y_2} \ne P^2_{y_2}$ (or both), the quadratic Gaussian decision problem does not admit (generically) a linear Stackelberg equilibrium solution. The unique solution, which exists under (47) and Condition (3), is nonlinear. □

The conditions of the preceding Corollary involve only the marginal distributions of $y_1$ and $y_2$; in the complement of these conditions we can derive the following linear solution:

Proposition 7. For the quadratic Gaussian decision problem, let both $P^1_{y_1} = P^2_{y_1}$ and $P^1_{y_2} = P^2_{y_2}$ (but not necessarily $P^1_{y_1y_2} = P^2_{y_1y_2}$, and even $P^1_{y_1y_2x} = P^2_{y_1y_2x}$). Then, if

$$2\,[r(D_{12}D_{21}D_{21}^*D_{12}^*)]^{1/2} + r(D_{21}^*D_{22}D_{21}) < 1 \tag{66}$$

the problem admits a unique Stackelberg equilibrium solution for DM1 (the leader) which is linear in $y_1$:

$$\gamma^S(y_1) = Ay_1 \tag{67a}$$

where $A: \mathbb{R}^{m_1} \to U_1$ is the unique bounded linear operator solving

$$Ay_1 = \big(D_{12}D_{21}AS^2_{12}S^1_{21} + D_{21}^*D_{12}^*AS^1_{12}S^2_{21} - D_{21}^*D_{22}D_{21}AS^2_{12}S^2_{21} + F_1S^1_{01} + D_{12}F_2S^2_{02}S^1_{21}$$
$$\qquad + D_{21}^*F_2S^1_{02}S^2_{21} - D_{21}^*D_{22}F_2S^2_{02}S^2_{21}\big)y_1, \quad \forall y_1 \in \mathbb{R}^{m_1}, \tag{67b}$$


and  are  defined  by  (31b)-(31c) ,  and  Sj.  is  defined  by  E1  [x | y±]  =  sj.y.  , 


Proof. When $P^1_{y_1} = P^2_{y_1}$ and $P^1_{y_2} = P^2_{y_2}$, $g^1(y_1) = g^2(y_2) = 1$ and hence Conditions (1) and (2) of Thm. 2 are always satisfied, and in Condition (3), $\rho_2 = \rho_4 = 1$. Then, (66) is the counterpart of (39), and hence existence and uniqueness follow from Thm. 2. Linearity, on the other hand, follows by noting that if we start iteration (35) with $\gamma^{(0)} = 0$, since $g^1(y_1) = g^2(y_2) = 1$ every term will be linear in $y_1$ (see also (64)), and hence the limit (which exists by Thm. 2) will be linear. Then, substituting $\gamma^S(y_1) = Ay_1$ in (31), we obtain (67b), by simply letting $g^1(y_1) = g^2(y_2) = 1$ in (64). □

When there is a discrepancy between the DM's perceptions of the variances of either $y_1$ or $y_2$, Prop. 7 will not hold, and the problem will admit (generically) a nonlinear equilibrium solution, as proven earlier in Prop. 6 and Corollary 5. In this case, an explicit closed-form solution cannot be obtained; however, an approximate solution can be derived by using the iteration (35) which, for the Gaussian problem, becomes

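$$\gamma^{(k+1)}(y_1) = D_{12}D_{21}E^1[E^2[\gamma^{(k)}(y_1)|y_2]\,|\,y_1] + D_{21}^*D_{12}^*\,g^1(y_1)E^2[g^2(y_2)E^1[\gamma^{(k)}(y_1)|y_2]\,|\,y_1]$$
$$\qquad - D_{21}^*D_{22}D_{21}\,g^1(y_1)E^2[g^2(y_2)E^2[\gamma^{(k)}(y_1)|y_2]\,|\,y_1] + \big(F_1S^1_{01} + D_{12}F_2S^2_{02}S^1_{21}\big)y_1$$
$$\qquad + \big(D_{21}^*F_2S^1_{02} - D_{21}^*D_{22}F_2S^2_{02}\big)\,g^1(y_1)E^2[g^2(y_2)y_2\,|\,y_1]. \tag{68}$$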

If we start this iteration with $\gamma^{(0)}(y_1) = 0$, or any linear function of $y_1$, at every iteration we obtain linear combinations of terms of the type $A^{(\ell)}y_1$ and $B^{(\ell)}y_1\exp\{-\tfrac12 y_1'V^{(\ell)}y_1\}$, where $A^{(\ell)}$ and $B^{(\ell)}$ are linear operators, and $V^{(\ell)} > 0$ is an $m_1 \times m_1$ matrix. Since this is a successive approximation technique under Condition (3), even stopping the iteration after a finite number of terms will provide a solution sufficiently close to the unique optimum. Hence, generically, a suboptimal policy for DM1, which is sufficiently close to the unique solution of (31), will be of the form

$$\gamma_N(y_1) = A^{(N)}y_1 + \sum_{\ell \le N} B^{(\ell)}y_1\exp\{-\tfrac12 y_1'V^{(\ell)}y_1\}$$

where $N$ is a sufficiently large integer (related to the number of iterations taken in (68)), and $A^{(N)}$, $B^{(\ell)}$, $V^{(\ell)}$ are generated via the iteration (68). Note that as $N \to \infty$ this solution will uniformly converge to the unique optimum.

Yet another suboptimal solution can be obtained by restricting DM1's policies, at the outset, to linear functions of $y_1$, i.e. to the form (62) where $A$ is a variable linear operator. DM2's response to any such policy will also be linear (in $y_2$), thus making $T_2$ in (10) a linear operator. Then, the problem faced by DM1 is minimization of (11), with $\gamma(y_1) = Ay_1$, over all linear bounded operators $A$. The solution of this minimization problem will provide DM1 with a linear policy that is (in general) inferior to the limiting solution of (68), unless, of course, $g^1(y_1) = g^2(y_2) = 1$ in which case the two solutions will be the same (satisfying (67b)). We do not pursue here the details of the derivation of the best linear solution for the general case (as outlined above).

Furthermore, it is possible to work out the various conditions for the special cases of the scalar and continuous-time team problems (formulated as in Examples 1 and 2) and write down the equilibrium solution explicitly whenever it is linear. Such an analysis would routinely follow the lines of the discussion of Examples 1 and 2, and hence will not be included here mainly because of space limitations.

6.  Discussion  of  Possible  Extensions,  and  Concluding  Remarks 

In  the  preceding  sections,  we  have  developed  an  equilibrium  theory  for  two- 
person  quadratic  decision  problems  with  static  information  patterns,  wherein  the 
decision  makers  (DM's)  do  not  necessarily  have  the  same  perception  of  the  underlying 
probability  space;  that  is,  our  formulation  allows  for  discrepancies  in  the  way 
different  DM's  perceive  the  probability  space.  As  indicated  earlier,  when  such 
discrepancies  exist,  even  team  problems  have  to  be  analyzed  in  the  framework  of 
nonzero-sum  stochastic  games,  and  in  such  a  framework  the  Nash  solution  concept  is  the 
most  suitable  equilibrium  concept  if  the  DM's  occupy  symmetric  (non-hierarchical) 
positions  in  the  decision  process,  and  the  Stackelberg  solution  concept  becomes  more 
meaningful if there is a hierarchy in decision making.

Section  3  of  the  paper  has  provided  a  set  of  sufficient  conditions  for 
existence  and  uniqueness  of  Nash  equilibrium  in  the  case  of  symmetric  mode  of  decision 
making,  with  the  additional  feature  that  it  be  stable.  This  is  an  appealing  feature 
of  the  solution  because,  in  order  to  arrive  at  equilibrium  (as  a  consequence  of  an 
infinite  number  of  response  iterations),  each  DM  does  not  have  to  know  the  subjective 
probability  measures  perceived  by  the  other  DM,  but  has  to  know  only  the  policy  adopted 
by  the  other  DM  at  the  most  recent  step  of  the  iteration. 

In  Section  4  we  have  presented  a  counterpart  of  the  results  of  §3  under  the 
asymmetric  mode  of  decision  making.  The  conditions  derived  ensure  that  the  equilibrium 
policy  of  the  leader  can  be  obtained  as  the  limit  of  an  infinite  sequence  which 
involves  conditional  expectations  under  two  different  probability  measures.  This 
sequence [(35), (27)] is structurally different from its counterpart in §3 (see (14)),
even  for  team  problems,  and  it  contains  R-N  derivatives  of  the  two  probability 
measures  as  multiplying  factors  (which  are  absent  in  (14)). 

In Section 5 we have shown that when the underlying probability distributions
belong to a Gaussian class, the Nash equilibrium solution will be linear (affine, if
mean values are nonzero) in the available static measurements, with the gain operator
satisfying a Lyapunov-type operator equation (cf. Thm. 3). This solution and the
associated existence conditions have been studied further in the context of two
examples which involve scalar and continuous-time stochastic team problems with
multiple probability models. In developing a counterpart of Thm. 3 for the asymmetric
mode of decision making, we have arrived at a seemingly surprising (unexpected) result:
the unique Stackelberg equilibrium solution is generically nonlinear in the
measurements (even under Gaussian multiple probability measures). This constitutes
the first unique nonlinear solution reported in the literature for a quadratic Gaussian
static game or team problem.† It should be noted that we have not given a closed-form
expression for this nonlinear solution, but have instead provided a recursive scheme
which generates admissible policies that come arbitrarily close to the optimum solution.

Several extensions of the results presented in this paper seem to be possible.
Firstly, we should note that the general Hilbert-space framework adopted in this paper
and the general solutions presented for the Gaussian problems in Section 5 (Thms. 3
and 4) apply to other models also, such as the ones similar to the continuous-time team
problem treated in [9] and the Stackelberg problem of [26], but with the DM's having
different probability models. It is expected that some explicit results (closed-form
solutions) can also be obtained in these cases, but this point has not been pursued in
this paper and is left for future research.

† Reference [12] also reports on existence of nonlinear (Nash) solutions for
quadratic Gaussian nonzero-sum games, but there the nonlinear solution is one of many
solutions, one of which is linear, and is due to nonunique intersection of reaction
functions (which disappears under appropriate conditions).

Another possible extension of the results of this paper would be to the class
of problems in which the random state of nature (i.e. x) as well as the measurements
($y_i$) are stochastic processes. The general theories of Sections 3 and 4 could easily
be extended so as to encompass this class of problems also, provided that the problem
is set up under the right mathematical assumptions. In particular, if the random
variables are taken to be Hilbert space valued weak random variables, with the inner
product satisfying some continuity and boundedness conditions [11], Thms. 1-4 directly
apply to this more general class of decision problems, when interpreted in the right
framework. Furthermore, extension to dynamic (multi-stage) problems is also possible,
by adopting the framework of (say) [8] for the linear-quadratic-Gaussian problem. Then
the unique Nash equilibrium solution under the one-step-delay observation sharing
pattern can be obtained by basically following the approach of [8] and utilizing in the
recursive derivation Thm. 3 of this paper instead of Thm. 2 of [8]. Details of this
derivation are, however, rather involved, and will be reported elsewhere.

Regarding  the  Nash  equilibrium  solution,  yet  another  possible  extension  would 
be  to  multiple  decision-maker  problems  with  more  than  two  (say,  N)  DM’s.  Even  though 
the  definition  of  Nash  equilibrium  (cf.  Def.  1)  admits  a  natural  (unique)  extension  to 
such  problems,  that  of  stable  equilibrium  (cf.  Def.  2)  does  not  extend  in  a  unique  way. 
One  viable  alternative  is  to  assume  that  each  DM  reacts  optimally  to  the  set  of  most 
recent  policies  of  all  the  other  DM's,  which  leads  to  a  set  of  N  relations  similar  to 
(9).  In  this  case,  (12)  will  be  replaced  by  N  equations  with  the  right-hand-side 
expressions involving N-1 policies of different DM's. However, the line of reasoning
that  took  us  from  (13)  to  (14)  does  not  have  a  counterpart  if  N>2,  and  in  general  it 
is  not  possible  to  obtain  N  recursion  relations  each  of  which  involves  only  one  DM's 
policies  at  consecutive  stages.  Then,  the  counterpart  of  (13)  will  have  to  be  treated 
as  a  "multi-valued"  operator  equation,  in  which  context  an  existence  and  uniqueness 
result  will  have  to  be  established.  This  seems  to  be  a  challenging  problem  whose 
solution requires somewhat different mathematical techniques than the ones employed in
this  paper. 

One  source  of  motivation  for  the  research  reported  in  this  paper  has  been 
(as  discussed  in  Section  1)  the  desire  to  investigate  the  sensitivity  and  robustness 
of  team-optimal  solutions  (in  stochastic  teams)  to  independent  variations  in  the 
perceptions of the DM's of the underlying probability space (and, in particular, the
probability measure). The analysis of this paper indeed provides a framework for
such a study when the roles of the DM's are either symmetric or asymmetric, since an
equilibrium theory has been established in both cases within an "ε-neighborhood" of
the team-optimal solution. Some further work is needed in order to determine the
"satisfiability" of the several existence conditions obtained in the paper when the
region of interest is an ε-neighborhood of a common probability space, and to further
extend the analysis to an investigation of sensitivity and robustness properties of
team solutions (obtained under the stipulation of existence of a common underlying
probability space) in this ε-neighborhood.


An aspect of the decision problem studied here, which is worth bringing forth,
is that the subjective probability measures perceived by each DM are fixed in advance
and the DM's do not attempt to change their subjective priors during the course of the
decision process. Hence, in this sense, the problem treated here is categorically
different from the class of problems treated in [18]-[21], where the objective was for
the DM's to arrive at a common (consistent) set of probabilistic descriptions of the
unknown variables. In the symmetric mode, there is, however, an implicit learning
process built in the recursive process that leads to the stable equilibrium decision
rules for each DM, since the DM's do not necessarily have access to each other's
perception of the priors.

Yet another aspect of the problem treated in this paper is that the general
formulation could be viewed as a multi-modeling in multiple-decision maker problems;
however, as opposed to the singular perturbations approach of [22]-[24], here the
multi-modeling is in the probabilistic description of the decision problem, with each
DM having a different probabilistic model of the "rest of the world."


Appendix A


In this appendix we state a number of results concerning the spectral
radii of linear bounded operators.

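Let $A: \Gamma \to \Gamma$ and $B: \Gamma \to \Gamma$ be two linear bounded operators, where $\Gamma$ is
a Hilbert space equipped with the inner product $\langle \cdot,\cdot \rangle$. Then the spectral radius
of $A$ is defined by

$$ r(A) = \limsup_{k \to \infty} \, [\langle\langle A^k \rangle\rangle]^{1/k} \tag{A-1} $$

where $\langle\langle A \rangle\rangle$ is the norm of $A$, given by

$$ \langle\langle A \rangle\rangle = \sup_{\gamma \in \Gamma} \, [\langle A\gamma, A\gamma \rangle / \langle \gamma, \gamma \rangle]^{1/2}. \tag{A-2} $$

For self-adjoint operators there is an equivalence between the spectral
radius and norm of an operator; specifically, if $A$ is self-adjoint,

$$ r(A) = \langle\langle A \rangle\rangle = \sup_{\gamma \in \Gamma} \{ |\langle \gamma, A\gamma \rangle| / \langle \gamma, \gamma \rangle \} \tag{A-3} $$

[see [13], p. 514]. However, for operators which are not self-adjoint, such
an equivalence does not exist, and one can only provide bounds on $r(A)$: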

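Lemma A.1. For any linear bounded operator $A$,

$$ r(A) \le \langle\langle A \rangle\rangle = [r(A^*A)]^{1/2}. $$

Proof. Since $A$ belongs to a Banach algebra, $\langle\langle A^k \rangle\rangle \le \langle\langle A \rangle\rangle^k$, and hence

$$ r(A) \le \limsup_{k \to \infty} \{ \langle\langle A \rangle\rangle^k \}^{1/k} = \langle\langle A \rangle\rangle. $$

Furthermore,

$$ \langle\langle A \rangle\rangle = \sup_{\gamma \in \Gamma} \, [\langle \gamma, A^*A\gamma \rangle / \langle \gamma, \gamma \rangle]^{1/2} $$

which is $[r(A^*A)]^{1/2}$ by (A-3) because $A^*A$ is self-adjoint.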


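Lemma A.2. Let $A$ and $B$ be two linear bounded operators which commute. Then,

(i) $r(AB + A^*B^*) \le 2[r(AA^*)\,r(B^*B)]^{1/2} = 2[r(A^*A)\,r(BB^*)]^{1/2}$

(ii) $r(AB) \le r(A)\,r(B)$

Proof. (i) Since $AB + A^*B^*$ is self-adjoint, using (A-3)

$$ r(AB + A^*B^*) = \sup_{\gamma \in \Gamma} \{ |\langle \gamma, (AB + A^*B^*)\gamma \rangle| / \langle \gamma, \gamma \rangle \} = 2 \sup_{\gamma \in \Gamma} \{ |\langle A\gamma, B^*\gamma \rangle| / \langle \gamma, \gamma \rangle \} $$

where the equality has followed since $A$ and $B$ commute. Using the Cauchy-Schwarz
inequality [3], this expression can be bounded from above by

$$ 2 \sup_{\gamma \in \Gamma} \{ [\,|\langle A\gamma, A\gamma \rangle|\,]^{1/2} \, [\,|\langle B^*\gamma, B^*\gamma \rangle|\,]^{1/2} / \langle \gamma, \gamma \rangle \} $$

and performing individual supremization we further obtain the bound

$$ 2 \sup_{\gamma \in \Gamma} \left[ \frac{\langle A\gamma, A\gamma \rangle}{\langle \gamma, \gamma \rangle} \right]^{1/2} \sup_{\gamma \in \Gamma} \left[ \frac{\langle B^*\gamma, B^*\gamma \rangle}{\langle \gamma, \gamma \rangle} \right]^{1/2} = 2 \langle\langle A \rangle\rangle \, \langle\langle B^* \rangle\rangle = 2[r(A^*A)\,r(BB^*)]^{1/2} $$

where the last line has followed from Lemma A.1. Note that this expression can
be written in different ways because $r(A^*A) = r(AA^*)$, $r(BB^*) = r(B^*B)$.

(ii) Firstly note that

$$ r(AB) = \limsup_{k \to \infty} \, [\langle\langle (AB)^k \rangle\rangle]^{1/k} = \limsup_{k \to \infty} \, [\langle\langle A^k B^k \rangle\rangle]^{1/k} \tag{*} $$

where the last equality has followed because $A$ and $B$ commute. Now, since $A$, $B$ belong
to a Banach algebra, $\langle\langle A^k B^k \rangle\rangle \le \langle\langle A^k \rangle\rangle \, \langle\langle B^k \rangle\rangle$ for every $k$, which is equivalent to

$$ [\langle\langle A^k B^k \rangle\rangle]^{1/k} \le [\langle\langle A^k \rangle\rangle \, \langle\langle B^k \rangle\rangle]^{1/k} = [\langle\langle A^k \rangle\rangle]^{1/k} \, [\langle\langle B^k \rangle\rangle]^{1/k} \quad \text{for every } k, $$

and taking lim sup of both sides, and using (*),

$$ r(AB) \le \limsup_{k \to \infty} \{ [\langle\langle A^k \rangle\rangle]^{1/k} \, [\langle\langle B^k \rangle\rangle]^{1/k} \} \le r(A)\,r(B) $$

which proves the desired result.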


Lemma A.3. Let $A$ and $B$ be both self-adjoint. Then,

$$ r(A + B) \le r(A) + r(B). $$

Proof. This follows from (A-3) and the triangle inequality applied to the norm $\langle\langle \cdot \rangle\rangle$. □
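
The bounds of Lemmas A.1-A.3 can be checked numerically in finite dimensions. The following is a minimal sketch, assuming real 2x2 matrices as stand-ins for the bounded operators (so that the adjoint is simply the transpose); the helper names `spectral_radius`, `operator_norm`, `matmul`, `transpose` and `add`, as well as the particular matrices, are illustrative choices and do not appear in the paper.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    """Adjoint of a real matrix."""
    return [[a[j][i] for j in range(2)] for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def spectral_radius(a):
    """Largest eigenvalue modulus, from the 2x2 characteristic polynomial."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return max(abs((tr + disc) / 2.0), abs((tr - disc) / 2.0))

def operator_norm(a):
    """Induced 2-norm, computed as [r(A*A)]^(1/2) as in Lemma A.1."""
    return math.sqrt(spectral_radius(matmul(transpose(a), a)))

TOL = 1e-12
A = [[0.5, 1.0], [0.0, 0.5]]   # not self-adjoint
B = matmul(A, A)               # a polynomial in A, hence commutes with A
AB = matmul(A, B)

# Lemma A.1: r(A) <= <<A>>
lemma_a1 = spectral_radius(A) <= operator_norm(A) + TOL

# Lemma A.2(i): r(AB + A*B*) <= 2 [r(AA*) r(B*B)]^(1/2)
sym = add(AB, matmul(transpose(A), transpose(B)))
bound_i = 2.0 * math.sqrt(spectral_radius(matmul(A, transpose(A)))
                          * spectral_radius(matmul(transpose(B), B)))
lemma_a2_i = spectral_radius(sym) <= bound_i + TOL

# Lemma A.2(ii): r(AB) <= r(A) r(B)
lemma_a2_ii = spectral_radius(AB) <= spectral_radius(A) * spectral_radius(B) + TOL

# Lemma A.3: subadditivity for self-adjoint (symmetric) matrices
S1 = [[1.0, 0.5], [0.5, -1.0]]
S2 = [[0.2, -0.3], [-0.3, 0.4]]
lemma_a3 = spectral_radius(add(S1, S2)) <= spectral_radius(S1) + spectral_radius(S2) + TOL

# Without self-adjointness, subadditivity can fail: N is nilpotent, so r(N) = r(N*) = 0,
# yet N + N* has eigenvalues +1 and -1.
N = [[0.0, 1.0], [0.0, 0.0]]
fails_without_self_adjointness = (
    spectral_radius(add(N, transpose(N)))
    > spectral_radius(N) + spectral_radius(transpose(N))
)
```

The matrix `A` is chosen with spectral radius 0.5 but norm larger than one, which illustrates why conditions stated in terms of the spectral radius are weaker than conditions stated in terms of the norm.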


Appendix  B 


Proof  of  Theorem  1 

Let us first recall the following result from functional analysis (see, for
example, [13, Chapter XIII, Theorem 3]).

Lemma B.1. Let $S$ be a linear bounded operator mapping a Hilbert space $\Gamma$ into itself,
and consider the equation

$$ \gamma = S\gamma + \mu \tag{B-1} $$

defined on $\Gamma$. Furthermore, consider the "successive approximation"

$$ \gamma^{(k+1)} = \mu + S\gamma^{(k)}, \quad k = 0, 1, \ldots \tag{B-2} $$

to the solution of (B-1). Then, the sequence generated by (B-2) converges to a
unique element of $\Gamma$, for any starting point $\gamma^{(0)}$, which is further a solution of
(B-1), if, and only if, the spectral radius of $S$ is less than unity, i.e. there
exists a $\rho$, $0 < \rho < 1$, such that

$$ r(S) \le \rho. \tag{B-3} $$

□
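
As an illustration of Lemma B.1, the sketch below runs the successive approximation (B-2) in a two-dimensional setting. It is only a finite-dimensional stand-in under stated assumptions: the matrix `S`, the vector `mu`, the starting points and the function name `successive_approximation` are hypothetical and are not taken from the paper. The matrix is chosen so that its spectral radius is 0.5 while its norm exceeds one, to emphasize that convergence is governed by the spectral radius.

```python
def successive_approximation(S, mu, gamma0, iterations):
    """Iterate gamma <- mu + S gamma, i.e. the scheme (B-2), a fixed number of times."""
    n = len(mu)
    gamma = list(gamma0)
    for _ in range(iterations):
        gamma = [mu[i] + sum(S[i][j] * gamma[j] for j in range(n))
                 for i in range(n)]
    return gamma

# r(S) = 0.5 < 1, although the induced 2-norm of S is about 1.21.
S = [[0.5, 1.0], [0.0, 0.5]]
mu = [1.0, 1.0]

# Two different starting points should lead to the same limit.
limit_a = successive_approximation(S, mu, [0.0, 0.0], 200)
limit_b = successive_approximation(S, mu, [100.0, -50.0], 200)

# Exact solution of gamma = S gamma + mu, obtained by solving (I - S) gamma = mu by hand.
fixed_point = [6.0, 2.0]

# With spectral radius above one the same scheme does not settle down.
S_bad = [[1.1, 0.0], [0.0, 0.5]]
diverging = successive_approximation(S_bad, mu, [0.0, 0.0], 200)
```

In this example the first component of `diverging` grows geometrically, while `limit_a` and `limit_b` agree with `fixed_point` to machine precision.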

Now, applying this Lemma to our problem, we identify $S$ with either $S_1$ or $S_2$
(given by (16)), $\Gamma$ with $U_1$ or $U_2$, the successive approximation (B-2) with (14), and
condition (B-3) above with (19) for either i=1 or 2. Then, the statement of
Thm. 1(i) readily follows from the preceding Lemma, in view of Prop. 2.

Furthermore,  since  can  be  written  as  the  product  of  two  commuting 
operators,  using  Lemma  A.2(ii)  we  obtain 


r,(S.)  =  r, (DXP  , . )  <  r.(D1)r.(P., .). 




Under  (20a)  this  can  be  bounded  from  above  by  =  P  <  1»  thereby  ensuring  (19). 

On  the  other  hand,  since  the  spectral  radius  of  a  bounded  linear  operator  is  bounded 
from  above  by  its  norm  [13],  and  that  ||  D  l|  ^  =  <<D1>>i  because  D*  also  maps  into 

(in  addition  to  being  a  mapping  from  into  itself),  (20b)  follows.  This  completes
the  proof  of  Thm.  1.  □

Appendix  C 

1. Proof of Corollary 2 (Section 3)

Here we verify that the second inequality of (20b) is implied by the
condition that (21a) is uniformly bounded by $\rho_i^2$. Towards this end, we first have,
for each $\gamma \in U_i$, from the Cauchy-Buniakowski (Schwarz) inequality [3],†

$$ \|P_{i|i}\gamma\|_i^2 = \Big\| \int_{Y_j} P^i_{y_j|y_i}(d\eta \mid y_i) \int_{Y_i} \gamma(\xi)\, P^j_{y_i|y_j}(d\xi \mid \eta) \Big\|_i^2 \le \Big\| \int_{Y_i} \gamma(\xi)\, P^j_{y_i|y_j}(d\xi \mid \eta) \Big\|_i^2 $$

$$ = \int_{Y_j} \Big( \int_{Y_i} \gamma(\xi)\, P^j_{y_i|y_j}(d\xi \mid \eta),\ \int_{Y_i} \gamma(\xi)\, P^j_{y_i|y_j}(d\xi \mid \eta) \Big)_i \, P^j_{y_j}(d\eta)\, g^j(\eta) $$

where the last equality has involved a change of measures, using the R-N derivative
$g^j(\eta)$. Now, again using the Cauchy-Schwarz inequality, this expression can be bounded
from above by

$$ \le \int_{Y_j} \int_{Y_i} (\gamma(\xi), \gamma(\xi))_i \, P^j_{y_i|y_j}(d\xi \mid \eta)\, g^j(\eta)\, P^j_{y_j}(d\eta) $$

$$ = \int_{Y_i} (\gamma(\xi), \gamma(\xi))_i \, P^i_{y_i}(d\xi)\, g^i(\xi) \int_{Y_j} P^j_{y_j|y_i}(d\eta \mid \xi)\, g^j(\eta), $$

where the last equality has followed from Bayes Theorem. It now readily follows that,
under the condition of Corollary 2, the last expression is bounded from above by
$\rho_i^2 \|\gamma\|_i^2$, thus proving the desired result for i=1,2. □

† In the following we have abused the notation and have used $\|\cdot\|_i$ to stand also for
the natural norm derived from $(\cdot,\cdot)_i$, but this should not create any source of
confusion.

2. Proof of Corollary 4 (Section 4)

The fact that uniform boundedness of (43a) (by $(\rho_2)^2$) implies the first
inequality of (40b) follows readily from the proof given above, since the spectral
radius of $P_{1|1}^* P_{1|1}$ is equal to the square of the norm of $P_{1|1}$. Now, to verify that
uniform boundedness of (43b) implies the second inequality of (40b) we follow basically
the same line of reasoning, but the details of the proof are more involved. Towards
this end we first note that for each $\gamma \in U_1$

$$ \|K\gamma\|_1^2 = \Big\| g^1(\xi) \int_{Y_2} P^2_{y_2|y_1}(d\eta \mid y_1 = \xi)\, g^2(\eta) \int_{Y_1} P^2_{y_1|y_2}(db \mid y_2 = \eta)\, \gamma(b) \Big\|_1^2 $$

$$ = \Big\| \sqrt{g^1(\xi)} \int_{Y_2} P^2_{y_2|y_1}(d\eta \mid y_1 = \xi)\, g^2(\eta) \int_{Y_1} P^2_{y_1|y_2}(db \mid y_2 = \eta)\, \gamma(b) \Big\|_2^2 $$

$$ \le \Big\| \sqrt{g^1(\xi)}\, g^2(\eta) \int_{Y_1} P^2_{y_1|y_2}(db \mid y_2 = \eta)\, \gamma(b) \Big\|_2^2 $$

where the second equality follows from a change of measures, and the last bound
follows from the Cauchy-Schwarz inequality. It should be pointed out that
here we have abused the notation and have used $\|m(\xi,\eta)\|_2$ to mean

$$ \|m(\xi,\eta)\|_2 = \Big\{ \int_{Y_1} \int_{Y_2} (m(\xi,\eta), m(\xi,\eta))_1 \, P^2_{y_1 y_2}(d\xi \times d\eta) \Big\}^{1/2} $$

where $m$ is a $(y_1, y_2)$-measurable random variable taking values in $\Gamma_1$; hence, the
sub-index "2" indicates that the probability space is the one determined by
the subjective probability measure of DM2.

Now, the latter bound can further be bounded above by

$$ \int_{Y_1} \int_{Y_2} g^1(\xi)\, |g^2(\eta)|^2 \, P^2_{y_1 y_2}(d\xi \times d\eta) \int_{Y_1} P^2_{y_1|y_2}(db \mid y_2 = \eta)\, (\gamma(b), \gamma(b))_1 $$

since (i)

$$ \Big( \int_{Y_1} P^2_{y_1|y_2}(db \mid y_2 = \eta)\, \gamma(b),\ \int_{Y_1} P^2_{y_1|y_2}(db \mid y_2 = \eta)\, \gamma(b) \Big)_1 \le \int_{Y_1} P^2_{y_1|y_2}(db \mid y_2 = \eta)\, (\gamma(b), \gamma(b))_1 $$

by the Cauchy-Schwarz inequality (because $P^2_{y_1|y_2}$ is also a probability
measure), and (ii) $g^1(\xi)\, |g^2(\eta)|^2 \ge 0$. Hence, by interchanging the variables
$\xi$ and $b$,

$$ \|K\gamma\|_1^2 \le \int_{Y_1} \int_{Y_2} \int_{Y_1} (\gamma(\xi), \gamma(\xi))_1 \, g^1(b)\, |g^2(\eta)|^2 \, \frac{P^2_{y_1 y_2}(db \times d\eta)\, P^2_{y_2|y_1}(d\eta \mid y_1 = \xi)\, P^2_{y_1}(d\xi)}{P^2_{y_2}(d\eta)} $$

$$ = \int_{Y_1} P^1_{y_1}(d\xi)\, (\gamma(\xi), \gamma(\xi))_1 \int_{Y_1} \int_{Y_2} g^1(\xi)\, g^1(b)\, |g^2(\eta)|^2 \, P^2_{y_1|y_2}(db \mid y_2 = \eta)\, P^2_{y_2|y_1}(d\eta \mid y_1 = \xi) $$

and  under  (43b)  this  can  be  bounded  above  by 


1  /  Py  (dO(y(C),Y(C))1  P \  = 


Yi  yi 


P4M* 


which  completes  the  proof. 


Appendix  D 

1. Derivation of First and Second Gateaux Variations [(25)-(26)]

Starting  with  the  expression  for  J  as  given  by  (23)  ,  we  first  obtain 

dJ(yjh)  =  J(y+h)  -  J(y) 


=  \  <h,y>1  +  |  <y,h>1  +  j  <h , h> 1 


1  2 


+  j!  { (F^E2[XJ  y2]  +  D^1E2[y(y1)!y2],  E2[h(y1)  |y2J), 


+  ^21^*"^^yl^y2^’  ^22^2  ^  ^22^21  ^ 

+  (D21E2[h(y1)iy2],  D22D21E2[h(yi)|y2])2}  P^(d£) 


-  <h,E  [F1x|y1]>1-/  (D21E2[h(y1) |y2] ,  F2x)2  P^dx.Y^d?) 

XxY  2 

-  <h,E1[Dj2D21E2[Y(y1)|y2]  yL]  +  E1[Dj2F2E2[x|y2] |y1]>1 


-lr«l  .2  J2 


-  <Y,Ei[Dj2D^1E"[h(y1)|y2]|y1]>1  -  [E2  [h^)  |y2]  |y13>1 


=  6J(y;h)  +  5  J (y  ;h)  . 

Now, since $\delta J(\gamma;h)$ is homogeneous of degree one, and $\delta^2 J(\gamma;h)$ is homogeneous of degree
two, $\Delta J(\gamma;h)$ admits a unique decomposition with the corresponding expressions being
(after some simplification)

* 

5J(y;h)  =  <h,y>1  +  /  (E2 [h(y1) | y 2 ] ,  d2;ld22{f2e2  I Y21 

Y2 

+  D21E2[y(y1)|y2]})1  P*  (d«  -  <h,E1[Fjx|y1]>1 

-  /  (E2 [ h (y ., )  |  y _ ] ,  D2*FEx)  PX(dx,Y  dC)  (D-l) 

XxY2 

..  „1  172-,!  rr-2  r  I  .  1  |„  ,rT  i_  t>2  ~1  _ 


2*  1* 


-  <h,Dj2D‘1P1 | lY(yi)  +  d12F2E  tE  [x|y2]|yl]>l  '  <Pl|lh’  D21D12Y>1 


* 

■J(y;h)  =  Y  <h,h>L  +  Y  /  CE2 [h(yi) i y2] , 


E2|h(y1)|y2])1  P^dO  -  •'l>,Dj2D21P1;1h>1 


(D-2) 


where we have used some properties of adjoint operators under inner products,

and  the  notation  introduced  in  (28);  we  have  also  made  use  of  the  fact  that  the 

1  2 

bounded  linear  operator  D^D,^:  commutes  with  the  double  conditional 

expectation  operator  P^j^  (or  PjJ^)* 

We  now  prove  a  lemma  which  will  be  used  in  simplifying  these  expressions 

further . 


Lemma D.1. For $h(\cdot) \in U_1$, $f(\cdot) \in U_2$,

$$ \int_{Y_2} (E^2[h(y_1) \mid y_2 = \xi],\ f(\xi))_1 \, P^1_{y_2}(d\xi) = \int_{Y_1} (h(\eta),\ g^1(\eta)\, E^2[g^2(y_2) f(y_2) \mid y_1 = \eta])_1 \, P^1_{y_1}(d\eta) $$

$$ = \langle h,\ g^1(y_1)\, E^2[g^2(y_2) f(y_2) \mid y_1] \rangle_1 \tag{D-3} $$

where $g^i(\cdot)$ are given by (2).

Proof. The proof follows from the following set of equalities where we are allowed to
change orders of integration because $U_1$ and $U_2$ are Hilbert spaces of random variables
well defined under both measures:

$$ \int_{Y_2} (E^2[h(y_1) \mid y_2 = \xi],\ f(\xi))_1 \, P^1_{y_2}(d\xi) = \int_{Y_2} \Big( \int_{Y_1} h(\eta)\, P^2_{y_1|y_2}(d\eta \mid \xi),\ f(\xi) \Big)_1 P^1_{y_2}(d\xi) $$

$$ = \int_{Y_2} \int_{Y_1} (h(\eta), f(\xi))_1 \, P^2_{y_1|y_2}(d\eta \mid \xi)\, P^1_{y_2}(d\xi) $$

$$ = \int_{Y_2} \int_{Y_1} (h(\eta), f(\xi))_1 \, P^2_{y_2|y_1}(d\xi \mid y_1 = \eta)\, g^1(\eta)\, g^2(\xi)\, P^1_{y_1}(d\eta) $$

where, in the next to the last line, we have used the continuity property of the inner product
in pulling out $P^2_{y_1|y_2}(d\eta \mid \xi)$. Now, pulling the integration over $Y_2$ into the
inner product, we further obtain

$$ = \int_{Y_1} P^1_{y_1}(d\eta)\, \Big( h(\eta),\ \int_{Y_2} g^2(\xi) f(\xi)\, P^2_{y_2|y_1}(d\xi \mid y_1 = \eta) \Big)_1 g^1(\eta) $$

$$ = \int_{Y_1} P^1_{y_1}(d\eta)\, (h(\eta),\ g^1(\eta)\, E^2[g^2(y_2) f(y_2) \mid y_1 = \eta])_1 $$

which is the desired result.
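
The identity (D-3) can be sanity-checked on a finite probability space. The sketch below is illustrative only and rests on assumptions that are not stated in this appendix: scalar-valued $h$ and $f$ (so the inner product is an ordinary product), finite measurement sets, arbitrary positive joint tables `P1` and `P2` for the two DMs, and R-N derivatives taken as `g1 = dP2_{y1}/dP1_{y1}` and `g2 = dP1_{y2}/dP2_{y2}`, which is the choice consistent with the change of measures used in the proof above (the paper defines them in (2), which is not reproduced here).

```python
# Joint probability tables P[a][b] = Prob(y1 = a, y2 = b) as perceived by each DM
# (hypothetical numbers, all strictly positive so the derivatives are well defined).
P1 = [[0.10, 0.20], [0.15, 0.05], [0.30, 0.20]]
P2 = [[0.25, 0.05], [0.10, 0.30], [0.05, 0.25]]
Y1 = range(3)
Y2 = range(2)

def marginal_y1(P):
    return [sum(P[a][b] for b in Y2) for a in Y1]

def marginal_y2(P):
    return [sum(P[a][b] for a in Y1) for b in Y2]

p1_y1, p1_y2 = marginal_y1(P1), marginal_y2(P1)
p2_y1, p2_y2 = marginal_y1(P2), marginal_y2(P2)

# Assumed R-N derivatives of the marginals (see the lead-in).
g1 = [p2_y1[a] / p1_y1[a] for a in Y1]
g2 = [p1_y2[b] / p2_y2[b] for b in Y2]

h = [1.0, -2.0, 0.5]   # arbitrary function of y1
f = [3.0, -1.0]        # arbitrary function of y2

# Left-hand side: sum over y2 of E2[h(y1) | y2] * f(y2), weighted by DM1's marginal of y2.
lhs = sum(
    sum(h[a] * P2[a][b] / p2_y2[b] for a in Y1) * f[b] * p1_y2[b]
    for b in Y2
)

# Right-hand side: <h, g1(y1) E2[g2(y2) f(y2) | y1]> under DM1's marginal of y1.
rhs = sum(
    h[a] * g1[a] * sum(g2[b] * f[b] * P2[a][b] / p2_y1[a] for b in Y2) * p1_y1[a]
    for a in Y1
)
```

Both sides reduce to the same double sum over `P2[a][b] * p1_y2[b] / p2_y2[b]`, which is the discrete counterpart of the Bayes step in the proof.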


Now,  using  (D-3)  in  (D-2)  we  obtain 


ft 

2J(7;h)  =  j  <h,h>  +  j  f  (h(n) ,  g1(n)D21Di2D21E2[g2(y9)  E2[h(y1’)  |  y2  ]  !  >*1=r 


,.P  (dn) 

1
2  &  *

-i  |lh>1  -  i  <h,D^Dj2^|lh>1 


which  verifies  (26) . 


To  verify  (25)  ,  we  apply  the  result  of  Lemma  D.l  to  (D-l)  to  obtain 


<5J(y;h)  =  <h,y>1  +  ^.g^y^D2^  {F2E2[g2(y2)E2[x|y2]  jy^ 
+  D^1E2[g2(y2)E2fy(y1)|y2]!y1]>>1  -  <h,F^E1[x|y1]>1 


-  <h,g1(y1)D21F2E2[g2(y2)E1[x|y2]|y1]>1  -  <h»  ^2D21P1 1 1  +  D21D12P1 1 1} Y>1 


1  ~2  - 


*  * 

,2  1  -* 


-  <h,DP2F2E1[E2[x|y2] iy1]>1  2  <h,Y>1  -  <h,2y>1  -  <h,B>1 
where  2  and  6  are  defined  by  (27a)  and  (27b),  respectively.  This  then  completes 
the  verification  of  (25)  and  (26). 

2. Derivation of an Expression for $P_{1|1}^*$, the adjoint of $P_{1|1}$

Firstly note that

$$ \int_{Y_1} (P_{1|1}^* \gamma(y_1),\ h(y_1))_1 \, P^1_{y_1}(dy_1) = \int_{Y_1} (\gamma(y_1),\ P_{1|1} h(y_1))_1 \, P^1_{y_1}(dy_1) $$

$$ \equiv E^1[(\gamma(y_1),\ E^1[E^2[h(y_1) \mid y_2] \mid y_1])_1] = E^1[(\gamma(y_1),\ E^2[h(y_1) \mid y_2])_1] $$

where we have used the smoothing property of conditional expectation under the
probability measure $P^1$. Now, a further conditioning under $P^1_{y_1|y_2}$ yields

$$ = E^1[(E^1[\gamma(y_1) \mid y_2],\ E^2[h(y_1) \mid y_2])_1], $$

and using (D-3) [cf. Lemma D.1] this becomes equivalent to

$$ = E^1[(g^1(y_1)\, E^2[g^2(y_2)\, E^1[\gamma(y_1) \mid y_2] \mid y_1],\ h(y_1))_1], $$

thus proving (32). The first expression in (32) follows by routine manipulations.


Appendix E


In this appendix we show that the Stackelberg solution satisfying
(10)-(11) is indeed an equilibrium solution, namely the so-called strong equilibrium
of a decision problem with a modified (dynamic) information pattern. Towards
this end, let us replace the original decision problem with one in which the
decision (action) variables are $\gamma_1 \in \Gamma_1$ and $\gamma_2 \in \Gamma_2$, for DM1 and DM2, respectively,
and the information pattern is dynamic (for DM2), with DM2 having access to the
decision of DM1. Let $\tilde{U}_1$ and $\tilde{U}_2$ denote the strategy spaces of DM1 and DM2,
respectively, under this new information pattern; furthermore denote their
generic elements by $\beta_1$ and $\beta_2$, respectively. Now, since DM1 has static
information, all permissible policies $\beta_1$ will be constant mappings into $\Gamma_1$, and
hence $\tilde{U}_1 = \Gamma_1$. For DM2, on the other hand, all permissible policies will be
measurable mappings $\beta_2: \Gamma_1 \to \Gamma_2$. Finally, let $\tilde{J}_i: \tilde{U}_1 \times \tilde{U}_2 \to \mathbb{R}$ be the cost function
of DMi, satisfying the boundary condition

$$ \tilde{J}_i(\beta_1, \beta_2) = J_i(\gamma_1, \gamma_2), \quad \forall \beta_1 \equiv \gamma_1 \in \Gamma_1 = \tilde{U}_1, \tag{E-1} $$

where $\gamma_2 \in \Gamma_2$ is uniquely defined for each $\gamma_1 \in \Gamma_1$ by

$$ \gamma_2 = \beta_2(\gamma_1) \quad \text{in } \Gamma_2. \tag{E-2} $$

Now, let $(\gamma_1^s, \gamma_2^s) \in \Gamma_1 \times \Gamma_2$ be a Stackelberg solution to the original decision problem
with the unique mapping $T_2$ satisfying (10). Note that $T_2 \in \tilde{U}_2$, and hence
relabelling $T_2$ as $\beta_2^s$, and $\gamma_1^s$ as $\beta_1^s$, in (10) and (11), we obtain in view of (E-1)-(E-2)

$$ \tilde{J}_1(\beta_1^s, \beta_2^s) \le \tilde{J}_1(\beta_1, \beta_2^s) \quad \forall \beta_1 \in \tilde{U}_1 $$

$$ \tilde{J}_2(\beta_1, \beta_2^s) \le \tilde{J}_2(\beta_1, \beta_2) \quad \forall (\beta_1, \beta_2) \in \tilde{U}_1 \times \tilde{U}_2, $$

which clearly indicate that $(\beta_1^s, \beta_2^s) \in \tilde{U}_1 \times \tilde{U}_2$ is a noncooperative Nash equilibrium.
This is, in fact, a stronger equilibrium (called "strong equilibrium" [17]) because
the second inequality is satisfied not only for $\beta_1 = \beta_1^s$, but for all $\beta_1 \in \tilde{U}_1$.
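
The argument of this appendix can be illustrated on a small finite game. The sketch below is a toy under stated assumptions, not the paper's quadratic problem: both DMs have three actions, the cost tables `J1` and `J2` are hypothetical, and DM2's best response to each action of DM1 is unique (the role played by the unique mapping $T_2$ in (10)). In the modified game, a strategy of DM2 is any map from DM1's actions to DM2's actions, and the two inequalities displayed above are checked by enumeration.

```python
from itertools import product

# Hypothetical cost tables, indexed as J[a1][a2]; both DMs minimize.
J1 = [[4, 1, 6], [3, 5, 2], [7, 0, 8]]   # leader (DM1)
J2 = [[2, 5, 1], [0, 3, 4], [6, 2, 9]]   # follower (DM2)
actions = range(3)

# Follower's unique optimal response to each leader action (finite analogue of T_2).
T2 = {a1: min(actions, key=lambda a2: J2[a1][a2]) for a1 in actions}

# Stackelberg solution: the leader minimizes its cost along the follower's response.
a1_s = min(actions, key=lambda a1: J1[a1][T2[a1]])
a2_s = T2[a1_s]

# Modified game: the follower's strategy is a map beta2 (a tuple with beta2[a1] = response),
# and the costs are evaluated at (a1, beta2[a1]), which mirrors (E-1)-(E-2).
leader_ok = all(
    J1[a1_s][T2[a1_s]] <= J1[a1][T2[a1]]
    for a1 in actions
)
follower_ok = all(
    J2[a1][T2[a1]] <= J2[a1][beta2[a1]]
    for beta2 in product(actions, repeat=3)
    for a1 in actions
)
```

Here `follower_ok` is checked for every leader action, not only for `a1_s`, which is the finite counterpart of the "strong equilibrium" property noted above.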